actions / runner-images

GitHub Actions runner images

hdiutil failures when creating DMGs on macOS 13 runner #7522

Open wkiefer opened 1 year ago

wkiefer commented 1 year ago

Description

We use https://github.com/create-dmg/create-dmg as a small script wrapper around hdiutil to create and notarize DMG files from our macOS applications.

However, when switching to the macos-13 runner, we get the following error when calling hdiutil create:

hdiutil: create: returning 49168
hdiutil: create failed - Resource busy

Platforms affected

Runner images affected

Image version and build link

Image: macos-13
Version: 20230426.3
Included Software: https://github.com/actions/runner-images/blob/macOS-13/20230426.3/images/macos/macos-13-Readme.md
Image Release: https://github.com/actions/runner-images/releases/tag/macOS-13%2F20230426.3

Is it regression?

https://github.com/actions/runner-images/releases/tag/macOS-12%2F20230425.1

Expected behavior

hdiutil create should not return error 49168, resource busy

Actual behavior

hdiutil create consistently returns error 49168, resource busy

Repro steps

I will try to make a public repo sample repro — right now steps are to build with Xcode 14.3 on the macOS 13 image, then run

create-dmg \
  --app-drop-link 240 332 \
  --icon "Sample.app" 240 108 \
  --icon-size 128 \
  --volname "Sample" \
  --window-pos 100 100 \
  --window-size 480 582 \
  --codesign "Your Developer ID Application" \
  "/output/path/Sample.dmg" \
  "/input/folderWithApp/"

Or simply try to run any hdiutil create command on a macOS 13 runner.

Alexey-Ayupov commented 1 year ago

Hello @wkiefer, we will take a look. However, please remember that macOS 13 is still in beta.

vpolikarpov-akvelon commented 1 year ago

Hi @wkiefer. I couldn't reproduce the issue. Both hdiutil and create-dmg work flawlessly for me. Could you provide more detailed repro steps or share a link to the failing workflow?

geoffthemedio commented 1 year ago

I think we have hit the same issue. https://github.com/freeorion/freeorion/tree/MacOS_action_versions https://github.com/freeorion/freeorion/actions/runs/4902807369/jobs/8755515334?pr=4509#step:9:426

CPack Error: Error executing: /usr/bin/hdiutil create -ov -srcfolder "/Users/runner/work/freeorion/freeorion/build/_CPack_Packages/MacOSX/DragNDrop/FreeOrion_2023-05-06.ba56d96_Test_MacOSX_10.15" -volname "FreeOrion" -fs "HFS+" -format UDZO "/Users/runner/work/freeorion/freeorion/build/_CPack_Packages/MacOSX/DragNDrop/temp.dmg"
CPack Error: Error generating temporary disk image.
hdiutil: create failed - Resource busy

CPack Error: Problem compressing the directory
CPack Error: Error when generating package: FreeOrion
Command PhaseScriptExecution failed with a nonzero exit code

** BUILD FAILED **

The following build commands failed:
    PhaseScriptExecution CMake\ PostBuild\ Rules /Users/runner/work/freeorion/freeorion/build/build/FreeOrion.build/Release/package.build/Script-571234464C2A5D0D43F1AB42.sh (in target 'package' from project 'FreeOrion')

vpolikarpov-akvelon commented 1 year ago

Hey @geoffthemedio, @wkiefer. Looks like the issue is gone now that image version 20230509.4 has been rolled out. Could you try to run your workflow again?

geoffthemedio commented 1 year ago

I reran the action and it completed successfully. I'm not sure that nothing else changed and that it's doing the exact same thing, but at least I can no longer reproduce the issue.

Neverous commented 1 year ago

I think I'm hitting a similar issue, or at least an issue with hdiutil: "hdiutil: create failed - No child processes". The newest image didn't really help :cry:

Runner image:

  Image: macos-13
  Version: 20230509.4
  Included Software: https://github.com/actions/runner-images/blob/macOS-13/20230509.4/images/macos/macos-13-Readme.md
  Image Release: https://github.com/actions/runner-images/releases/tag/macOS-13%2F20230509.4

vpolikarpov-akvelon commented 1 year ago

Hi @Neverous. It looks similar indeed. But I suppose you might get this error because of the failure on the previous step:

error: /Applications/Xcode_14.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/install_name_tool: no LC_RPATH load command with path: /Users/runner/work/efibooteditor/Qt/6.5.0/macos/lib found in: /Users/runner/work/efibooteditor/efibooteditor/build/dist/_CPack_Packages/Darwin/DragNDrop/EFIBootEditor-v0.0.0-3f25f4b38b30f8575f17628808319e6017e44bc7-macos-13-qt-6.5.0/dist/efibooteditor.app/Contents/MacOS/efibooteditor (for architecture x86_64), required for specified option "-delete_rpath /Users/runner/work/efibooteditor/Qt/6.5.0/macos/lib"

Could you ensure that it doesn't affect packaging?

Neverous commented 1 year ago

I get the same thing on macOS 12: https://github.com/Neverous/efibooteditor/actions/runs/4908677336/jobs/8925056030#step:12:232 and 11: https://github.com/Neverous/efibooteditor/actions/runs/4908677336/jobs/8925056236#step:12:232 and they go through fine, so I would guess that it doesn't, but I'll see if I can do anything about it; maybe it will help.

Neverous commented 1 year ago

Yeah, the rpath error is just that both macdeployqt and the CPack generator that I use to make the final package are trying to adjust rpaths, and neither has an option to skip it from what I can tell... I think it's harmless, at least for hdiutil.

Anyway, I think I got to very simple reproduction steps: https://github.com/Neverous/efibooteditor/actions/runs/4997753556/jobs/8952437874 (the issue seems to be somewhat random, as just restarting the job sometimes succeeds and sometimes doesn't :thinking: https://github.com/Neverous/efibooteditor/actions/runs/4997791581)

vpolikarpov-akvelon commented 1 year ago

Hey. I finally realized what causes this issue. It looks like it's XProtectBehaviorService, which was introduced in macOS 13 Ventura. If you are unlucky, it may lock your newly created dmg and hdiutil will fail. I can suggest implementing a retry in your workflow to work around this.

Another option may be shutting the aforementioned service down (or just killing the process). You may use something like this:

echo killing...; sudo pkill -9 XProtect >/dev/null || true;
echo waiting...; while pgrep XProtect; do sleep 3; done;

I can't recommend doing that because it may cause issues in the future that we can't predict or avoid. It may also have security-related consequences. Use it at your own risk.
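
For reference, the retry suggested above could look roughly like the sketch below. The `retry` function name, the attempt count and the delay are illustrative assumptions and do not come from the runner images or from create-dmg. The hdiutil flags in the usage comment are the ones from the CPack log earlier in this thread, with placeholder paths.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits 0 or MAX_ATTEMPTS is reached.
# Hypothetical helper for illustration; tune the numbers for your own build.
retry() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while true; do
    # Success: stop retrying immediately.
    if "$@"; then
      return 0
    fi
    # Out of attempts: report and fail.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt $attempt failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Possible use in a workflow step (placeholder paths):
# retry 5 10 hdiutil create -ov -srcfolder "build/Sample" \
#   -volname "Sample" -format UDZO "build/Sample.dmg"
```

A pause between attempts gives whatever holds the image a chance to release it. This only helps with the intermittent "Resource busy" failure, not with errors that fail every time.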

Neverous commented 1 year ago

Tried both, neither works for me, so I must be hitting something slightly different :thinking: (hdiutil: create failed - No child processes doesn't seem to be as random as the Resource busy one, but so far I haven't been able to come up with simpler reproduction steps than running the whole build)

retrying 20 times killing XProtect

vpolikarpov-akvelon commented 1 year ago

You are right, Maciej, this is something different. Killing XProtect only helps when you get "Resource busy". I still suspect that the previous error may somehow cause hdiutil to fail. I understand that it didn't matter on macOS 11 & 12, but something else may have changed so that it matters now. Could you try to get rid of the "install_name_tool" error? Just to be sure that it's really nothing.

Neverous commented 1 year ago

Ah should've mentioned that with macdeployqt vs CPack comment, did that (disabled macdeployqt for tests, leaving only cpack dmg generation): https://github.com/Neverous/efibooteditor/commit/e7ec720c531000d1ea36a486c269463f798e23e9#diff-1e7de1ae2d059d21e1dd75d5812d5a34b0222cef273b7c3a2af62eb747f9d20aL410 , https://github.com/Neverous/efibooteditor/actions/runs/4997683758/jobs/8952293026#step:12:234

vpolikarpov-akvelon commented 1 year ago

Did you try enabling debug output for hdiutil? I suppose it may be done by setting CPACK_COMMAND_HDIUTIL option to /path/to/hdiutil -debug

Neverous commented 1 year ago

Don't see anything interesting unfortunately, just that diskimages-helper died :disappointed: logs

2023-05-25 17:08:25.255 hdiutil[11275:37357] [DIHelperProxy watchForHelperDeath] helper exited early
2023-05-25 17:08:25.255 hdiutil[11275:37357] helper died
2023-05-25 17:08:25.255 hdiutil[11275:37357] setHelperDoneWithResult: _helperDone = YES, _threadResultsError = 10

vpolikarpov-akvelon commented 1 year ago

Hey @Neverous. Looks like your issue may be resolved by running hdiutil using sudo.

I added this line to CMakeLists.txt and packaging works flawlessly:

set(CPACK_COMMAND_HDIUTIL "/usr/bin/sudo /usr/bin/hdiutil")

Neverous commented 1 year ago

Yes, thank you! It works indeed. Interesting, did some permissions change with macOS 13? Not sure why it would now require elevated access :thinking: but it packages successfully now with sudo :bow: .

vpolikarpov-akvelon commented 1 year ago

Glad to hear! Well, it looks like the problems have been solved, so I'm closing this issue for now. Feel free to reopen it if problems return.

To sum up:

grzegorzkrukowski commented 1 year ago

This is still the case on the latest macOS 13 runners. It's happening 80% of the time and none of the workarounds mentioned here is helping.

PatTheMav commented 1 year ago

I don't agree that this issue is actually resolved, given that XProtect seems to lock down created disk images on CI runners at random. Killing the process was at first "not recommended" but has seemingly been elevated to the accepted solution for that issue.

With macOS 14's release happening soon, I would expect a lot of developers to be racing to have updates ready for it, which might exacerbate this issue (i.e. the volume of unseen application bundles and disk images might trigger XProtect more often).

jpfeuffer commented 1 year ago

We have the same issue, and killing a security-relevant system process is definitely not a valid solution!

jpfeuffer commented 1 year ago

https://github.com/OpenMS/OpenMS/actions/runs/5931058665/job/16082111148

vpolikarpov-akvelon commented 1 year ago

Hey @jpfeuffer. Unfortunately, it is the only solution we have. macOS doesn't support automation well. The process we are talking about probably wouldn't bother you if you built your software by hand, but once you configure automation, everything runs fast enough that you start observing this and similar phenomena. To work around this, you may kill the offending process, configure retries, or add pauses between steps.

jpfeuffer commented 1 year ago

The problem is that your solutions are not working solutions: /path/to/hdiutil -debug does not work because hdiutil expects the "action" first. So no debugging inside CPack which can only prepend arguments. sudo does not help. And neither does the killing of the process before running CTest/CPack.

See https://github.com/OpenMS/OpenMS/actions/runs/5938356857/job/16102624563
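
One possible way around the argument-order limitation described above is a small wrapper that places -debug after the verb and passes everything else through. This is an untested sketch, not something confirmed on the runner images. The function name `hdiutil_with_debug` and the `HDIUTIL_BIN` override variable are hypothetical; `-debug` is assumed to be accepted after the verb, as described in the hdiutil man page.

```shell
# hdiutil_with_debug VERB [ARGS...]
# Re-issues the call as: hdiutil VERB -debug ARGS...
# HDIUTIL_BIN is a hypothetical override used only for illustration/testing.
hdiutil_with_debug() {
  if [ "$#" -lt 1 ]; then
    echo "usage: hdiutil_with_debug VERB [ARGS...]" >&2
    return 2
  fi
  verb="$1"
  shift
  # The verb stays first; -debug is inserted right after it.
  "${HDIUTIL_BIN:-/usr/bin/hdiutil}" "$verb" -debug "$@"
}
```

Saved as an executable script that ends with a call such as `hdiutil_with_debug "$@"`, this could be referenced from CPACK_COMMAND_HDIUTIL in place of the hdiutil path, so that CPack's own arguments are kept in order.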

jcelerier commented 1 year ago

also seeing this in azure pipelines more and more: https://dev.azure.com/ossia/libossia/_build/results?buildId=3542&view=logs&j=7bab896a-24f8-544f-51eb-43745367a332&t=0a75dee1-c6f9-5e3c-8b1c-a50cc807cb5c

tsteven4 commented 10 months ago

We have been having the same issue. It is intermittent but not rare. In our case hdiutil is being called from Qt's macdeployqt. A recent failure is https://github.com/GPSBabel/gpsbabel/actions/runs/7251472805/attempts/1

ERROR: Bundle creation error: "hdiutil: create failed - Resource busy\n"

tsteven4 commented 10 months ago

our failures are on macos12.

apparentsoft commented 10 months ago

I'm getting this intermittently as well. This is for a rather large DMG, of more than 150MB

fwcd commented 8 months ago

Linking the upstream issue for completeness: https://gitlab.kitware.com/cmake/cmake/-/issues/25671

echlebek commented 5 months ago

I think this issue should be re-opened. Observed this happening in this CI run: https://github.com/SumoLogic/sumologic-otel-collector/actions/runs/9415385897/job/25936308280

It's good that there is a workaround, but I don't think it's a long-term resolution for this issue.

sarathrajsrinivasan commented 2 months ago

Hi @wkiefer / @echlebek ,

We have reopened the issue, and in initial validation we could see that macOS-14 seems more stable when using the create-dmg wrapper. We are currently re-investigating macOS-13 and will keep you posted.

sarathrajsrinivasan commented 2 months ago

Hi @wkiefer / @echlebek ,

We were able to reproduce the issue when using the create-dmg wrapper.

The default filesystem used by the create-dmg wrapper is HFS+. After adding the option below to specify APFS as the image filesystem, our pipelines were able to generate the dmg consistently.

create-dmg \
  ... \
  --filesystem APFS \
  ... \

Could you please implement the same in your pipelines and let us know if it helps with the intermittent failures?

jcelerier commented 2 months ago

Still getting some intermittent failures with --filesystem APFS. For instance: hdiutil: resize: failed. No child processes (10)

sarathrajsrinivasan commented 2 months ago

Hi @jcelerier,

Could you please share some of the failed workflows for validation? Also, what percentage of jobs are failing now?

jcelerier commented 2 months ago

@sarathrajsrinivasan here's some workflows : relevant job is macOS build :

so, the failure ratio right now is ~1/3 but there's not much data

sarathrajsrinivasan commented 2 months ago

Thanks @jcelerier .

The "hdiutil: resize: failed" error is different from the earlier "Resource busy" error we were facing. We will check on it. Could you please retry a few more times and let us know if it is persistent?

jcelerier commented 2 months ago

@sarathrajsrinivasan still getting some of the previous failures:

https://dev.azure.com/ossia/libossia/_build/results?buildId=4003&view=logs&j=7bab896a-24f8-544f-51eb-43745367a332&t=0a75dee1-c6f9-5e3c-8b1c-a50cc807cb5c&l=139

Unmounting disk image...
hdiutil: couldn't eject "disk2" - Resource busy
Wait a moment...
Unmounting disk image...
hdiutil: couldn't eject "disk2" - Resource busy
Wait a moment...
Unmounting disk image...
hdiutil: couldn't eject "disk2" - Resource busy

Galkon commented 2 months ago

Started getting issues on macos-14 runner today:

  ⨯ Exit code: 6. Command failed: hdiutil resize -size 1753905584.2 /private/var/folders/4d/0gnh84wj53j7wyk695q0tc_80000gn/T/t-qPw43P/0.dmg\n' +
    'hdiutil: resize: failed. Device not configured (6)\n' +
    '\n' +
    'hdiutil: resize: failed. Device not configured (6)\n' +
    '  failedTask=build stackTrace=Error: Exit code: 6. Command failed: hdiutil resize -size 1753905584.2 /private/var/folders/4d/0gnh84wj53j7wyk695q0tc_80000gn/T/t-qPw43P/0.dmg\n' +
    'hdiutil: resize: failed. Device not configured (6)\n' +
    'hdiutil: resize: failed. Device not configured (6)

jcelerier commented 2 months ago

Another spurious issue:

killing...
waiting...
Creating disk image...
created: /Users/runner/work/1/s/install/rw.25997.score.dmg
hdiutil: resize: failed. No child processes (10)

https://dev.azure.com/ossia/libossia/_build/results?buildId=4024&view=logs&j=7bab896a-24f8-544f-51eb-43745367a332&t=0a75dee1-c6f9-5e3c-8b1c-a50cc807cb5c

aaneeley commented 2 months ago

Been having this issue (inconsistently) while using electron-builder

⨯ Exit code: 1. Command failed: hdiutil attach -noverify -noautoopen -readwrite /private/var/folders/py/lcjn3y352g1106vf1rqk521r0000gn/T/t-XlxQaO/0.dmg
hdiutil: attach failed - Device not configured

hdiutil: attach failed - Device not configured
  failedTask=build stackTrace=Error: Exit code: 1. Command failed: hdiutil attach -noverify -noautoopen -readwrite /private/var/folders/py/lcjn3y352g1106vf1rqk521r0000gn/T/t-XlxQaO/0.dmg
hdiutil: attach failed - Device not configured

hdiutil: attach failed - Device not configured

On runner:

Current runner version: '2.319.1'
Operating System
  macOS
  14.6.1
Runner Image
  Image: macos-14-arm64

Galkon commented 2 months ago

I was having a lot of issues with macos-14 runners in general. They have gone away since I found the suggestions in this thread and implemented them: https://github.com/actions/runner-images/issues/10511#issuecomment-2327934827

nuttyartist commented 1 month ago

We're having the same issue at: https://github.com/nuttyartist/notes/pull/700#issuecomment-2381604764

srcejon commented 1 hour ago

I also see "hdiutil: create failed - Resource busy" on macos-13.

kill -9 XProtect doesn't seem to help.

macos-12 build seems to work OK.