cross-platform-actions / action

Cross-platform GitHub action

FreeBSD jobs occasionally fail when ejecting the disk #64

Closed by yorickpeterse 1 year ago

yorickpeterse commented 1 year ago

In the last few weeks I've been observing FreeBSD jobs failing when trying to eject a disk, with the error being that the resource is busy. https://github.com/inko-lang/inko/actions/runs/6520144623/job/17707484871 is one such job where this happened:

2023-10-14T22:02:10.0083480Z Waiting for partitions to activate
2023-10-14T22:02:10.0365870Z Formatting disk2s1 as MS-DOS (FAT32) with name RES
2023-10-14T22:02:20.8279870Z 512 bytes per physical sector
2023-10-14T22:02:20.8280650Z /dev/rdisk2s1: 76594 sectors in 76594 FAT32 clusters (512 bytes/cluster)
2023-10-14T22:02:20.8282080Z bps=512 spc=1 res=32 nft=2 mid=0xf8 spt=32 hds=16 hid=2048 drv=0x80 bsec=77824 bspf=599 rdcl=2 infs=1 bkbs=6
2023-10-14T22:02:20.9370120Z Mounting disk
2023-10-14T22:02:24.9493860Z Finished partitioning on disk2
2023-10-14T22:02:25.0276190Z [command]/usr/bin/sudo umount /Volumes/RES
2023-10-14T22:02:26.1238370Z [command]/usr/bin/hdiutil detach /dev/disk2
2023-10-14T22:02:27.0086020Z hdiutil: couldn't eject "disk2" - Resource busy
2023-10-14T22:02:27.0565870Z 
2023-10-14T22:02:27.0569630Z /Users/runner/work/_actions/cross-platform-actions/action/v0.19.1/webpack:/cross-platform-action/node_modules/@actions/exec/lib/toolrunner.js:574
2023-10-14T22:02:27.0571900Z                 error = new Error(`The process '${this.toolPath}' failed with exit code ${this.processExitCode}`);
2023-10-14T22:02:27.0573230Z ^
2023-10-14T22:02:27.0573860Z Error: The process '/usr/bin/hdiutil' failed with exit code 16
2023-10-14T22:02:27.0576500Z     at ExecState._setResult (/Users/runner/work/_actions/cross-platform-actions/action/v0.19.1/webpack:/cross-platform-action/node_modules/@actions/exec/lib/toolrunner.js:574:1)
2023-10-14T22:02:27.0579510Z     at ExecState.CheckComplete (/Users/runner/work/_actions/cross-platform-actions/action/v0.19.1/webpack:/cross-platform-action/node_modules/@actions/exec/lib/toolrunner.js:557:1)
2023-10-14T22:02:27.0584190Z     at ChildProcess.<anonymous> (/Users/runner/work/_actions/cross-platform-actions/action/v0.19.1/webpack:/cross-platform-action/node_modules/@actions/exec/lib/toolrunner.js:451:1)
2023-10-14T22:02:27.0586440Z     at ChildProcess.emit (node:events:513:28)
2023-10-14T22:02:27.0587160Z     at maybeClose (node:internal/child_process:1100:16)
2023-10-14T22:02:27.0588120Z     at Socket.<anonymous> (node:internal/child_process:458:11)
2023-10-14T22:02:27.0588840Z     at Socket.emit (node:events:513:28)
2023-10-14T22:02:27.0589440Z     at Pipe.<anonymous> (node:net:301:12)

A restart of the job typically fixes it, though sometimes a second or third retry may be necessary. I'm not sure yet how to reliably reproduce it, as it happens sporadically.

jacob-carlborg commented 1 year ago

I've started to see this myself as well, also for OpenBSD. I'm not sure what to do about it. Perhaps catch the error and try the command again.

yorickpeterse commented 1 year ago

Over the last week or so I've been noticing more and more of these failures. Sometimes a single retry is enough; other times I have to retry several times before the job succeeds. Some sort of retry mechanism with a small wait time (e.g. 5 attempts with a 30-second wait between them) would be much appreciated :smiley:
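For illustration, the kind of retry mechanism suggested above could look roughly like the TypeScript sketch below. This is not the action's actual code: the `retry` helper, its parameter names, and the attempt/delay values are all hypothetical, and it only assumes that the wrapped operation rejects on failure (as the `@actions/exec` error in the log above suggests).

```typescript
// Hypothetical helper (not taken from the action's source): run an async
// operation up to `attempts` times, waiting `delayMs` between failed tries.
async function retry<T>(
  operation: () => Promise<T>,
  attempts: number,
  delayMs: number,
): Promise<T> {
  if (attempts < 1) {
    throw new RangeError("attempts must be at least 1");
  }

  let lastError: unknown;

  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Return as soon as one attempt succeeds.
      return await operation();
    } catch (error) {
      lastError = error;

      // Only wait if another attempt is still going to happen.
      if (attempt < attempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }

  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

With the numbers from the comment above, the detach step might then be wrapped as `retry(() => exec.exec("hdiutil", ["detach", "/dev/disk2"]), 5, 30_000)`, assuming `exec` is imported from `@actions/exec`. If all five attempts fail, the last "Resource busy" error would still fail the job.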

jacob-carlborg commented 1 year ago

Looks like it was possible to fix/mitigate this issue by retrying detaching the disk: https://github.com/cross-platform-actions/action/actions/runs/6749533591/job/18350090001#step:3:236. Fixed in https://github.com/cross-platform-actions/action/releases/tag/v0.21.1.

yorickpeterse commented 1 year ago

@jacob-carlborg Thanks! I'll give it a try! :tada: