cross-platform-actions / action

Cross-platform GitHub action
MIT License

NetBSD sometimes gets EPIPE errors #21

Slackadays closed this issue 1 year ago

Slackadays commented 1 year ago

Every few runs or so, the Clipboard NetBSD action gets an EPIPE error at Promise. This only happens with NetBSD and doesn't seem to follow any pattern other than being intermittent. However, a re-run has always fixed the problem so far. See https://github.com/Slackadays/Clipboard/actions/runs/4040118288/jobs/6945480282#step:3:66

I hope this bug is fixable because I still think CPA is a genius idea :1st_place_medal:

Slackadays commented 1 year ago

Update: NetBSD has been breaking consistently the past few builds, so something's up.

jacob-carlborg commented 1 year ago

Could you please re-run the job with debug logging enabled [1]?

[1] https://github.blog/changelog/2022-05-24-github-actions-re-run-jobs-with-debug-logging/

Slackadays commented 1 year ago

Looks like the second run worked, so it's back to being intermittent. https://github.com/Slackadays/Clipboard/actions/runs/4040118288/jobs/6955765382

jacob-carlborg commented 1 year ago

Could you try and run with debug info enabled for a while and report back here when it doesn't work?

https://docs.github.com/en/actions/monitoring-and-troubleshooting-workflows/enabling-debug-logging

Slackadays commented 1 year ago

I've added the right debug variable under repository variables, but now NetBSD actions are just spinning their wheels forever. https://github.com/Slackadays/Clipboard/actions/runs/4045627902/jobs/6957357468 I'm not sure if this is CPA's fault or not, so if it ever stops it might be interesting to look at.

Update: It hangs after showing these steps:

...
sent 11,450,131 bytes  received 5,854 bytes  2,082,906.36 bytes/sec
total size is 29,652,931  speedup is 2.59
VM is ready
Run: sudo pkgin -y install cmake gcc12
cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_COMPILER=/usr/pkg/gcc12/bin/g++ -DCMAKE_C_COMPILER=/usr/pkg/gcc12/bin/gcc
cmake --build . -j 2
sudo cmake --install .
export TMPDIR=/tmp
bash ../tests/suite.sh
/usr/bin/ssh -t runner@localhost sh -c 'cd "/home/runner/work/Clipboard/Clipboard" && exec "bash" -e'
Pseudo-terminal will not be allocated because stdin is not a terminal.

Now failing: https://github.com/Slackadays/Clipboard/actions/runs/4047572932/jobs/6962013350

Now spinning its wheels again

jacob-carlborg commented 1 year ago

I've seen this issue before. The problem seems to be that the action doesn't terminate even when the command has failed, or that GitHub Actions doesn't pick up that the action has failed and just continues running. Both [1] and [2] are legitimate failures (make is failing). I also tried this with FreeBSD, failing a build on purpose [3]. On line 621 you can see that the SSH process failed. That's the process that executes the command inside the VM, and it's supposed to fail when the command inside the VM fails. But the action continues running (for some reason) and gets cancelled on line 632 by the GitHub Actions timeout.

[1] finally timed out after the default 6 hours, so you can now see the actual error. Unfortunately, you need to wait for the timeout to see the output, unless you watch the log from the start.

For now, as a workaround, I recommend setting a timeout [5]. Based on [4], I suggest a timeout of 20 minutes.
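As a rough sketch of that workaround, a job-level `timeout-minutes` could look like the fragment below. The job name, the runner label, the checkout step, and the action and NetBSD versions are assumptions for illustration, not taken from the Clipboard workflow; the commands under `run` are the ones shown in the log above.

```yaml
# Hypothetical workflow fragment: job name, runner label and versions are assumptions.
jobs:
  netbsd:
    runs-on: macos-12
    # Cancel the job after 20 minutes instead of the default 6 hours.
    timeout-minutes: 20
    steps:
      - uses: actions/checkout@v3
      - name: Build and test on NetBSD
        uses: cross-platform-actions/action@v0.10.0
        with:
          operating_system: netbsd
          version: '9.2'
          run: |
            sudo pkgin -y install cmake gcc12
            cd build
            cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_COMPILER=/usr/pkg/gcc12/bin/g++ -DCMAKE_C_COMPILER=/usr/pkg/gcc12/bin/gcc
            cmake --build . -j 2
            sudo cmake --install .
            export TMPDIR=/tmp
            bash ../tests/suite.sh
```

With this in place, a hung run is cancelled once the limit is reached, and the output of the failed command becomes visible much sooner.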

[1] https://github.com/Slackadays/Clipboard/actions/runs/4045627902/jobs/6957357468
[2] https://github.com/Slackadays/Clipboard/actions/runs/4047572932/jobs/6962013350
[3] https://github.com/cross-platform-actions/action/actions/runs/4054629027/jobs/6976689499
[4] https://github.com/Slackadays/Clipboard/actions/runs/4047572932
[5] https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idtimeout-minutes

jacob-carlborg commented 1 year ago

@Slackadays can you try this commit and see if it works better: cc6f40829d8f441a0e271d357353f3f3e15d5ccd. I think you can point to a commit without me having to make a proper release.
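For reference, GitHub Actions accepts a full commit SHA after the `@` in `uses`, so trying the commit above without a release might look like this step. The step name and the `with` values are assumptions for illustration.

```yaml
# Hypothetical step: only the commit SHA comes from this thread.
steps:
  - name: Build and test on NetBSD
    uses: cross-platform-actions/action@cc6f40829d8f441a0e271d357353f3f3e15d5ccd
    with:
      operating_system: netbsd
      version: '9.2'
      run: bash ../tests/suite.sh
```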

Slackadays commented 1 year ago

This one's actually failing earlier now, so it looks like your change could have fixed the issue. https://github.com/Slackadays/Clipboard/actions/runs/4057206961/jobs/6982674945