actions / runner

The Runner for GitHub Actions :rocket:
https://github.com/features/actions
MIT License

`continue-on-error` in an action doesn't suppress an error from a node action called by an action it calls #3510

Open jsoref opened 2 days ago

jsoref commented 2 days ago

Describe the bug

To Reproduce

Steps to reproduce the behavior:

  1. workflow has a job (without continue-on-error) with steps: https://github.com/check-spelling-sandbox/turbo-octo-couscous/blob/0d26cf628c8fd56a3dabcea152bec09ec3d5ebca/.github/workflows/run.yml#L5-L7
  2. workflow has a step (without continue-on-error) which calls an action: https://github.com/check-spelling-sandbox/turbo-octo-couscous/blob/0d26cf628c8fd56a3dabcea152bec09ec3d5ebca/.github/workflows/run.yml#L49
  3. which calls an action (with continue-on-error): https://github.com/check-spelling-sandbox/turbo-octo-couscous/blob/0d26cf628c8fd56a3dabcea152bec09ec3d5ebca/call-call-node/action.yml#L6-L7
  4. which calls an action (without continue-on-error) https://github.com/check-spelling-sandbox/turbo-octo-couscous/blob/0d26cf628c8fd56a3dabcea152bec09ec3d5ebca/call-node/action.yml#L29-L35
  5. which is defined to call some node scripts: https://github.com/check-spelling-sandbox/turbo-octo-couscous/blob/0d26cf628c8fd56a3dabcea152bec09ec3d5ebca/node/action.yml
  6. the node post script will die: https://github.com/check-spelling-sandbox/turbo-octo-couscous/blob/0d26cf628c8fd56a3dabcea152bec09ec3d5ebca/node/post.js#L2-L4
  7. trigger the workflow/job/action/action/action/script: https://github.com/check-spelling-sandbox/turbo-octo-couscous/actions/runs/11398060710/job/31715002990
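
The nesting described in the steps above can be sketched roughly as follows. This is a reconstruction from the description and the logs, not a copy of the linked files: the `actions/checkout` step, the `name`/`description` values, and the `main.js` file name are assumptions for illustration. Only the job name `call-action`, the local action paths, the `node20` type, and `post.js` appear in the report.

```yaml
# .github/workflows/run.yml (sketch; trigger and checkout step are assumed)
on: workflow_dispatch
jobs:
  call-action:                    # job has no continue-on-error
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # assumed: local actions need the repo checked out
      - uses: ./call-call-node    # step has no continue-on-error
---
# call-call-node/action.yml (sketch)
name: call-call-node              # hypothetical metadata
description: Outer composite action
runs:
  using: composite
  steps:
    - uses: ./call-node
      continue-on-error: true     # expected to mask failures from everything below
---
# call-node/action.yml (sketch)
name: call-node                   # hypothetical metadata
description: Inner composite action
runs:
  using: composite
  steps:
    - uses: ./node                # no continue-on-error here
---
# node/action.yml (sketch)
name: node                        # hypothetical metadata
description: Node action whose post script fails
runs:
  using: node20
  main: main.js                   # hypothetical file name
  post: post.js                   # throws "dying", so the post stage exits with code 1
```

The failure surfaces only in the post stage, which runs as `Post call-call-node` at the end of the job.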

Expected behavior

The job should succeed (✅), because the continue-on-error on the intermediate composite action should mask the failure from the innermost nested composite action, which happens to call a node action that happens to fail.

Runner Version and Platform

Version of your runner? Current runner version: '2.320.0'

OS of the machine running the runner? OSX/Windows/Linux/... Linux

What's not working?

Please include error messages and screenshots.


Job Log Output

If applicable, include the relevant part of the job / step log output here. All sensitive information should already be masked out, but please double-check before pasting here.

❌ Post call-call-node
##[debug]Evaluating condition for step: 'Post call-call-node'
##[debug]Evaluating: always()
##[debug]Evaluating always:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Post call-call-node
##[debug]Loading inputs
##[debug]Loading env
Post job cleanup.
##[debug]Evaluating condition for step: 'run'
##[debug]Evaluating: always()
##[debug]Evaluating always:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: run
##[debug]Loading inputs
##[debug]Loading env
Post job cleanup.
goodbye cruel world 1

/home/runner/work/turbo-octo-couscous/turbo-octo-couscous/node/post.js:3
    throw "dying";
    ^
dying
(Use `node --trace-uncaught ...` to show where the exception was thrown)

Node.js v20.13.1
##[debug]Node Action run completed with exit code 1
##[debug]Finished: run
##[debug]Finishing: Post call-call-node

Runner and Worker's Diagnostic Logs

If applicable, add relevant diagnostic log information. Logs are located in the runner's _diag folder. The runner logs are prefixed with Runner_ and the worker logs are prefixed with Worker_. Each job run correlates to a worker log. All sensitive information should already be masked out, but please double-check before pasting here.

Runner_20241018-055434-utc.log Worker_20241018-055423-utc.log Worker_20241018-055437-utc.log

[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper] Starting process:
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   File name: '/home/runner/runners/2.320.0/externals/node20/bin/node'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   Arguments: '"/home/runner/work/turbo-octo-couscous/turbo-octo-couscous/./node/post.js"'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   Working directory: '/home/runner/work/turbo-octo-couscous/turbo-octo-couscous'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   Require exit code zero: 'False'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   Encoding web name:  ; code page: ''
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   Force kill process on cancellation: 'False'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   Redirected STDIN: 'False'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   Persist current code page: 'False'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   Keep redirected STDIN open: 'False'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper]   High priority process: 'False'
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper] Updated oom_score_adj to 500 for PID: 1760.
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper] Process started with process id 1760, waiting for process exit.
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper] STDOUT/STDERR stream read finished.
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper] STDOUT/STDERR stream read finished.
[2024-10-18 05:54:40Z INFO ProcessInvokerWrapper] Finished process 1760 with exit code 1, and elapsed time 00:00:00.0253822.
[2024-10-18 05:54:40Z INFO CreateStepSummaryCommand] Step Summary file (/home/runner/work/_temp/_runner_file_commands/step_summary_d59b41e9-fd95-4100-93b1-37a92370fb45) is empty; skipping attachment upload
[2024-10-18 05:54:40Z INFO CompositeActionHandler] Step result: Failed
[2024-10-18 05:54:40Z INFO ExecutionContext] Publish step telemetry for current step {
  "action": "./node",
  "type": "node20",
  "stage": "Post",
  "stepId": "3675990e-bdf9-4065-8979-e6f8641f1719",
  "stepContextName": "__a90cdae5-b2f4-46c1-8e9c-b9e97fadbb8b.node-0-1",
  "isEmbedded": true,
  "errorMessages": []
}.
[2024-10-18 05:54:40Z INFO CompositeActionHandler] Update job result with current composite step result 'Failed'.
[2024-10-18 05:54:40Z INFO CreateStepSummaryCommand] Step Summary file (/home/runner/work/_temp/_runner_file_commands/step_summary_eba9c2ef-42a0-40cf-91ca-2f9550dea84c) is empty; skipping attachment upload
[2024-10-18 05:54:40Z INFO StepsRunner] Step result: Failed
[2024-10-18 05:54:40Z INFO ExecutionContext] Publish step telemetry for current step {
  "action": "./call-call-node",
  "type": "composite",
  "stage": "Post",
  "stepId": "3675990e-bdf9-4065-8979-e6f8641f1719",
  "result": "failed",
  "errorMessages": [],
  "executionTimeInSeconds": 1,
  "startTime": "2024-10-18T05:54:40.636893Z",
  "finishTime": "2024-10-18T05:54:40.6731904Z"
}.
[2024-10-18 05:54:40Z INFO StepsRunner] Update job result with current step result 'Failed'.
[2024-10-18 05:54:40Z INFO StepsRunner] Current state: job state = 'Failed'

The workaround is to include continue-on-error one layer closer, but it's frustrating that an outer action can't use continue-on-error to continue on error.
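
A minimal sketch of that workaround, assuming "one layer closer" means the step in `call-node/action.yml` that directly uses the node action. The `name` and `description` values are hypothetical; the report does not show the file contents.

```yaml
# call-node/action.yml (sketch of the workaround)
name: call-node                   # hypothetical metadata
description: Inner composite action
runs:
  using: composite
  steps:
    - uses: ./node
      continue-on-error: true     # moved next to the failing node action
```

This pushes error handling into the inner action, which is the frustration described: the outer `call-call-node` action cannot decide on its own to tolerate the failure.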

ChristopherHX commented 2 days ago

Wow, you found another symptom of #2009.

But my PR never got any review comments.

With my patch applied, the outcome is successful:

##[debug]Evaluating condition for step: 'Post call-call-node'
##[debug]Evaluating: always()
##[debug]Evaluating always:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Post call-call-node
##[debug]Begin evaluating template
##[debug]Finished evaluating template
##[debug]Loading inputs
##[debug]Loading env
Post job cleanup.
##[debug]Evaluating condition for step: 'run'
##[debug]Evaluating: always()
##[debug]Evaluating always:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: run
##[debug]Begin evaluating template
##[debug]Finished evaluating template
##[debug]Loading inputs
##[debug]Loading env
Post job cleanup.
##[debug]Evaluating condition for step: 'run'
##[debug]Evaluating: always()
##[debug]Evaluating always:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: run
##[debug]Loading inputs
##[debug]Loading env
Post job cleanup.
goodbye cruel world 1
/home/ubuntu/rt/_work/turbo-octo-couscous/turbo-octo-couscous/node/post.js:3
    throw "dying";
    ^
dying
(Use `node --trace-uncaught ...` to show where the exception was thrown)
Node.js v20.5.0
##[debug]Node Action run completed with exit code 1
##[debug]Finished: run
##[debug]Finished: run
##[debug]Finishing: Post call-call-node

self-hosted runner

ubuntu@ubuntu:~/rt/bin$ ./Runner.Listener run                
√ Connected to GitHub

Current runner version: '3.0.0'
2024-10-18 09:56:45Z: Listening for Jobs
2024-10-18 09:57:15Z: Running job: call-action
2024-10-18 09:58:34Z: Job call-action completed with result: Succeeded