actions/runner: The Runner for GitHub Actions
https://github.com/features/actions

Send job logs to stdout #891

Closed vschettino closed 1 year ago

vschettino commented 3 years ago

Describe the enhancement
I think a good enhancement for self-hosted runners would be the ability to send the logs generated by the job/steps to stdout. This goes very well with ephemeral containers (#510). A broader idea would be to allow an arbitrary output target for the logs, such as a specific file. This could be implemented alongside the current behavior of sending the logs to GitHub, so we can still see them in a friendly interface during the job.

Use case

My current use case is to send logs to CloudWatch for debugging. Another use case I can see is monitoring job errors/warnings in different stacks, such as ELK and Graylog.

Additional information

Since we are here, does anyone know of a workaround for this? There seems to be no well-defined path where my logs are stored, so I can't redirect them to stdout using a symlink or something similar.

jwanderson1326 commented 3 years ago

Seconded; I'm trying to find a solution for this. Within self-hosted runners, Worker logs are output to _diag/, but these aren't the same logs that are shown within GitHub.

drigos commented 3 years ago

I also tried to find the files inside the container, but I couldn't understand the structure.

I was also wondering how GitHub Actions obfuscates secret data. If that is done on GitHub's servers, it could explain why the logs don't go to stdout.

I haven't tested it yet, but I found these endpoints that may contain the logs, though using them wouldn't be very practical for streaming to CloudWatch.

https://docs.github.com/en/rest/reference/actions#download-job-logs-for-a-workflow-run
https://docs.github.com/en/rest/reference/actions#download-workflow-run-logs

TingluoHuang commented 3 years ago

Secret masking happens on the runner, not on the server side.

All job/step logs are located in the _diag/pages folder and are deleted after being uploaded to the server when each step or the entire job finishes.
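
For illustration, a minimal way to peek at these files on a self-hosted runner host (assuming the runner was extracted to ~/actions-runner; adjust the path to your install location):

# The per-step log pages only exist while a job is running;
# they are removed once they have been uploaded to the server.
ls -l ~/actions-runner/_diag/pages/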

You can get the logs via the API:

https://docs.github.com/en/rest/reference/actions#download-job-logs-for-a-workflow-run
https://docs.github.com/en/rest/reference/actions#download-workflow-run-logs

So, is there any reason you need to get the logs from the runner instead of via the API?
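
For reference, a minimal sketch of pulling a job's log over the first endpoint with curl (OWNER, REPO, and JOB_ID are placeholders, and a token with access to the repository is assumed in $GITHUB_TOKEN):

# The API responds with a redirect to a short-lived download URL,
# so -L is needed to follow it.
curl -L \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  "https://api.github.com/repos/OWNER/REPO/actions/jobs/JOB_ID/logs"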

vschettino commented 3 years ago

Many log tools offer straightforward Docker integration, such as CloudWatch, Fluentd, Loki, and GCP. These tools use Docker logging drivers that automatically consume stdout and stream it into the chosen platform. This approach is better than consuming from the GitHub API because:

  1. Lower latency (and thus useful for real-time logs).
  2. Seamless integration with Docker container logging drivers (see the sketch after this list).
  3. Easier to capture logs directly on Linux systems (syslog, journald).
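
As an example of point 2, a minimal sketch using Docker's awslogs logging driver (my-runner-image and the log group name are placeholders, and the Docker daemon is assumed to have AWS credentials that can write to CloudWatch Logs):

# Everything the container writes to stdout/stderr is shipped to
# CloudWatch Logs by the logging driver; no changes inside the image.
docker run --rm \
  --log-driver=awslogs \
  --log-opt awslogs-region=us-east-1 \
  --log-opt awslogs-group=/gha/runner-logs \
  my-runner-image
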
jwanderson1326 commented 3 years ago

Yep, agreed with the above; we stream our logs to Datadog in our case.

fhammerl commented 1 year ago

As of v2.300.0, setting ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT to 1 or true will also print the logs to stdout, in addition to the log file.
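
A minimal sketch of trying this on a self-hosted runner launched from a shell (assuming the variable just needs to be present in the runner process's environment):

# Export the flag before starting the runner so the listener and
# worker processes inherit it, then watch the process's stdout.
export ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT=1
./run.sh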

juris commented 1 year ago

I tried setting ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT to true, and all I got was the runner diagnostic logs, not the logs of the actual jobs.

IAXES commented 1 year ago

Good day,

A few points I'd like to bring up here.


Re-opening this ticket, and issues with ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT.

First, @fhammerl, we should re-open this issue. I can confirm what @juris described. Using release v2.303.0, I've attempted to set ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT to either 1 or true at the job level, at the workflow level, and even by exporting it in the userdata (i.e. bootstrap logic) for my ephemeral GH runners, which are configured to forward STDOUT to CloudWatch and a secure logging back-end server. The most I see, with the introduction of ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT, is the names of individual jobs being printed, but that's it. We don't see any output from the jobs at all.

Further, I'd like to request/confirm whether this could be a global environment variable handled by the runner agent itself. Having this feature as something DevOps admins could enforce globally would be extremely valuable (e.g. for security auditing, or enforcing centralized log management for all jobs). That would be greatly preferable to individual project maintainers having to set workflow-level and/or job-level env vars to enable/disable logging to STDOUT/CloudWatch/etc. Further, making this an admin-level feature via env vars supplied to the runner agent would allow us to make it non-optional.

So, in short: the feature does not appear to be working as expected, and I'd like to confirm whether it could be set/enforced via an env var supplied to the runner agent itself.


User experience w/ GitHub Actions logs in general

Although the comments/issues noted below are for a separate component of GitHub Actions, they're, in my opinion, very relevant to this feature in the runner. Overall, the user experience with respect to logs in GitHub Actions is generally poor. I'll refer to these announcements/issues/etc.

There are numerous additional issues with logs being managed solely by GHES.


To the point

So, there's an overall point to this:

... it solves all of these problems. 😃

We can just pipe our logs to CloudWatch and in-house tools and never need to use the GitHub Actions web UI again. It's easy enough to trigger jobs via the CLI (not a lot of training needed 👍), or to throw together our own web UI to render the output.
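
For instance, a minimal sketch of driving a run from the terminal with the official gh CLI (ci.yml is a placeholder for a workflow that accepts workflow_dispatch):

# Dispatch the workflow on the main branch, then stream its progress;
# "gh run watch" prompts you to pick the run if no ID is given.
gh workflow run ci.yml --ref main
gh run watch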

Besides the UX issues noted above, GitHub Actions is pretty awesome overall. This proposed change would let us overcome all of the UX and delayed-access-to-logs issues noted above. Also, since this feature appears to be in active development, whereas the UX issues have been ongoing for two years (going on three), this seems to be the lowest-barrier path to really getting the most out of GitHub Actions as soon as possible.

Thank you.

IAXES commented 1 year ago

As a follow-up to my earlier request/comment from months back, it turns out this is pretty easy without relying on the ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT feature. In short, just create an aws_cloudwatch_log_group (e.g. via Terraform/Terragrunt), then:

Create a JSON config file via a heredoc (e.g. in userdata):

# The quoted 'EOF' delimiter prevents shell expansion, so "${runner_logs}"
# below is presumably substituted earlier (e.g. by a Terraform template)
# with the CloudWatch log group name.
CUSTOM_CW_RULE="/opt/aws/pipe_gha_logs.json"
cat << 'EOF' > "$CUSTOM_CW_RULE"
{
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [{
          "file_path": "/opt/actions-runner/_diag/**",
          "log_group_name": "${runner_logs}",
          "log_stream_name": "{instance_id}"
        }]
      }
    }
  }
}
EOF

...and then append this config to whatever else the amazon-cloudwatch-agent service is already handling:

/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
    -a append-config -m ec2 -s \
    -c "file:$CUSTOM_CW_RULE"

Since GHA is already filtering/masking sensitive info, this gives us masked logs in CloudWatch (making them easier to search, back up, offload, etc.).

pawanbahuguna commented 10 months ago

Can someone please let me know how to use ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT? Is there any documentation or a sample config file?

If it's just the environment variable, nothing happens after adding it. I'm still not able to see the output (for example, the output of any commands) in the Worker log file:

env:
  ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT: 1

Hebilicious commented 6 months ago

ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT works when enabled in the runner image itself (see https://github.com/actions/runner/blob/72559572f64f40554d43cfa04e4128725dc2274b/images/Dockerfile#L39).
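
For example, a minimal sketch of enabling it at launch time instead of baking it into the image (the image name and tag are illustrative):

# Passing the flag as a container env var has the same effect as the
# ENV line in the Dockerfile linked above.
docker run -e ACTIONS_RUNNER_PRINT_LOG_TO_STDOUT=1 ghcr.io/actions/actions-runner:latest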

However, this is very verbose, and some big JSON objects are printed across multiple lines instead of being stringified. Would it be possible to reduce the verbosity, or to have log levels?