facebookarchive / xctool

An extension for Apple's xcodebuild that makes it easier to test iOS and macOS apps.
Apache License 2.0

Failures on Mojave running app host tests with `exit code 132` #760

Closed: shepting closed this issue 5 years ago

shepting commented 6 years ago

With Xcode 10.1, we've noticed a number of test runs failing with exit code 132. The failures are consistent on a given machine.

[Screenshot: 2018-11-14 at 2:52:33 PM]

shepting commented 6 years ago

We initially hypothesized that it could have been related to the install method of Xcode 10.1 (@charlesmisson).

ExtremeMan commented 6 years ago

Have you tried reinstalling the simulators?

shepting commented 5 years ago

@ExtremeMan We delete and recreate all simulators with `fastlane snapshot reset_simulators` before every test run.

shepting commented 5 years ago

Does the 132 exit code have any significance? I didn't see any reference to that number on an initial browse of the source code.

shepting commented 5 years ago

@ExtremeMan I could come down to Mountain View sometime to show the issue. It has been failing on ~10% of our jobs, and it requires a restart of the machine in 90% of the cases (otherwise that machine will keep failing).

shepting commented 5 years ago

Based on initial tests and investigation, the failures may be related to the way Buck runs groups of tests in parallel.

shepting commented 5 years ago

[Screenshot: 2018-12-03 at 1:59:39 PM]

I think I may have been able to work around the issue by setting run_test_separately = True in the apple_test rule for the affected jobs.

ExtremeMan commented 5 years ago

It's hard to say where the 132 error is coming from. You can most likely find out from the system or simulator logs.

What exactly does `fastlane snapshot reset_simulators` do?

On our end, I have a setup that runs several sanity checks on the machine before actually running tests. All of this is done by invoking simctl utility APIs:

If any of the above steps fails, we treat the machine as broken. At that point you could remove all simulators and retry.

Also, do you have only one instance of Xcode installed on the machine?

Another remediation step we have used in the past is killing all Xcode launchd services. But I think lately just killing the simulator app results in resetting the whole Xcode and simulator environment.

mgrebenets commented 5 years ago

I'm observing the same "error 132" consistently as well. I've been seeing these errors ever since I upgraded to Mojave. We were using Xcode 9.4.1 back then, but with 10 and 10.1 it's all the same.

Interestingly, only the tests that need a host app and require iOS keychain access fail with this error. For those tests I have provided a host app. All the other tests run just fine.

However, the very same tests run reliably on macOS High Sierra. So I first attributed it to Mojave beta issues, but now Mojave is out officially and the problems are still there.

Adding run_test_separately=True to our override of apple_test didn't help.

shepting commented 5 years ago

I can run the xctool command directly to reproduce consistently. For us it's something like

xctool -reporter pretty \
    -sdk iphonesimulator \
    -destination 'name=iPhone 8' \
    run-tests -appTest \
    /usr/local/var/agent/apps-debugging/buck-out/gen/ios/Tests/HappoTests#apple-test-bundle,dwarf,no-include-frameworks,no-linkermap/HappoTests.xctest:/usr/local/var/agent/apps-debugging/buck-out/gen/ios/AirbnbHostedBundle#dwarf,no-include-frameworks,strip-non-global/OurApp.app/OurApp

And the final messages from xctool look like this:

    ✓ -[HappoTests testHappoRun] (415893 ms)
    3 passed, 0 failed, 0 errored, 3 total (446276 ms)
Illegal instruction: 4

And the exit code is 132. So that's a little more insight into the error from the xctool point of view.
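
On the significance of the number itself: shells conventionally report a process killed by signal N as exit status 128 + N, and SIGILL is signal 4, so 132 is consistent with the `Illegal instruction: 4` message rather than being a code defined in the xctool source. A minimal sketch of that decoding follows; the helper name `describe_exit_code` is made up for illustration and is not part of xctool or Buck.

```shell
# Hypothetical helper: explain an exit status that may encode a fatal signal.
# Assumes the usual shell convention: status = 128 + signal number.
describe_exit_code() {
  code="$1"
  if [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    # `kill -l <number>` prints the signal name without the SIG prefix.
    if name=$(kill -l "$sig" 2>/dev/null); then
      echo "exit code $code = 128 + $sig (SIG$name)"
    else
      echo "exit code $code = 128 + $sig (unknown signal)"
    fi
  else
    echo "exit code $code (not a signal exit)"
  fi
}

describe_exit_code 132
```

Running it right after the failing command, as in `describe_exit_code $?`, gives the same mapping on a CI agent.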

shepting commented 5 years ago

Setting run_test_separately = True only worked for Buck unit tests, not for these targets that require an app host.

ExtremeMan commented 5 years ago

So is it xctool crashing? What is in syslogs?

shepting commented 5 years ago

Also, to my surprise, make_release.sh crashes with the same exit code:

./make_release.sh: line 53: 49540 Illegal instruction: 4  
   XT_INSTALL_ROOT="$RELEASE_OUTPUT_DIR" "$RELEASE_OUTPUT_DIR"/bin/xctool 
   -sdk macosx run-tests 
   -logicTest "$BUILD_OUTPUT_DIR/Products"/Release/xctool-tests.xctest 
   -parallelize -bucketBy class -logicTestBucketSize 1
Last command: 1m 29.580s   Exit Code: 132

shepting commented 5 years ago

There's nothing in /private/var/log/system.log or in the output of syslog. Is that what you were referring to?

shepting commented 5 years ago

I grabbed a core dump from the crash, but I'm not entirely sure how to interpret the results when opening with lldb.

shepting commented 5 years ago

All I have is that `Illegal instruction: 4`, which suggests that perhaps xctool was either built with instructions not valid for this version of macOS, or some function is missing a return statement. Hmm.

[Screenshot: 2018-12-05 at 6:03:56 PM]

ExtremeMan commented 5 years ago

Can you open Console.app and look for Xcode and xctool crashes in "System Reports" and "User Reports"?

ExtremeMan commented 5 years ago

Also, can you try building with https://github.com/facebook/xctool/pull/759?

shepting commented 5 years ago

Ah, yes. Console.app does have an xctool crash in the "User Reports" section.

Also, if I turn on ulimit -c unlimited I can get a core dump when running make_release.sh and get the output:

 [Info] In Progress [xctool-tests.xctest (bucket #14, 7 tests), xctool-tests.xctest (bucket #20, 11 tests)] (0 ms)
./make_release.sh: line 53: 95604 Illegal instruction: 4  (core dumped) XT_INSTALL_ROOT="$RELEASE_OUTPUT_DIR" "$RELEASE_OUTPUT_DIR"/bin/xctool -sdk macosx run-tests -logicTest "$BUILD_OUTPUT_DIR/Products"/Release/xctool-tests.xctest -parallelize -bucketBy class -logicTestBucketSize 1
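
For anyone trying to reproduce this, here is a rough sketch of how the core dump can be captured without changing the limit for the whole login shell. The wrapper name `run_with_core_dumps` is hypothetical; only `ulimit -c unlimited` and `make_release.sh` come from this thread.

```shell
# Hypothetical wrapper: run a command with core dumps enabled.
# The subshell keeps the raised limit from leaking into the calling shell.
run_with_core_dumps() {
  (
    # Raising the limit can fail if the hard limit is lower; warn and continue.
    ulimit -c unlimited 2>/dev/null ||
      echo "warning: could not raise core size limit" >&2
    "$@"
  )
}

# Example (not executed here): run_with_core_dumps ./make_release.sh
```

The wrapper returns the exit status of the wrapped command, so a 132 still shows up as 132. On macOS, core files land in /cores by default, and something like `lldb path/to/xctool -c /cores/core.<pid>` followed by `bt all` should print a backtrace for every thread.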

The crash does seem very similar to Richard's PR, especially the ReadOutputsAndFeedOuputLinesToBlockOnQueue:

Process:               xctool [95604]
Path:                  /private/var/folders/*/xctool
Identifier:            xctool
Version:               ???
Code Type:             X86-64 (Native)
Parent Process:        ??? [94725]
Responsible:           xctool [95604]
User ID:               510

Date/Time:             2018-12-06 10:09:56.716 -0800
OS Version:            Mac OS X 10.14.1 (18B75)
Report Version:        12
Anonymous UUID:        6D22972A-5F8D-C47D-32B8-C691D88D0304

Time Awake Since Boot: 150000 seconds

System Integrity Protection: enabled

Crashed Thread:        2

Exception Type:        EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes:       0x0000000000000001, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Illegal instruction: 4
Termination Reason:    Namespace SIGNAL, Code 0x4
Terminating Process:   exc handler [95604]

Application Specific Information:
BUG IN CLIENT OF LIBDISPATCH: Unexpected EV_VANISHED (do not destroy random mach ports or file descriptors)

Thread 0:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib          0x00007fff6b72436a __ulock_wait + 10
1   libdispatch.dylib               0x00007fff6b59b5ac _dispatch_ulock_wait + 47
2   libdispatch.dylib               0x00007fff6b59bd43 _dispatch_group_wait_slow + 40
3   xctool                          0x00000001047528c2 -[RunTestsAction runTestables:options:xcodeSubjectInfo:] + 6060 (RunTestsAction.m:972)
4   xctool                          0x00000001047503e0 -[RunTestsAction performActionWithOptions:xcodeSubjectInfo:] + 1553 (RunTestsAction.m:586)
5   xctool                          0x0000000104738cf1 -[XCTool run] + 3720 (XCTool.m:221)
6   xctool                          0x0000000104737428 main + 1508 (main.m:110)
7   xctool                          0x00000001047320e4 start + 52

Thread 1:
0   libsystem_pthread.dylib         0x00007fff6b7da428 start_wqthread + 0
1   ???                             0x0000000054485244 0 + 1414025796

Thread 2 Crashed:
0   libdispatch.dylib               0x00007fff6b5ad2bd _dispatch_source_merge_evt + 165
1   libdispatch.dylib               0x00007fff6b5b588f _dispatch_event_loop_merge + 120
2   libdispatch.dylib               0x00007fff6b5a9f5c _dispatch_workloop_worker_thread + 295
3   libsystem_pthread.dylib         0x00007fff6b7da63c _pthread_wqthread + 409
4   libsystem_pthread.dylib         0x00007fff6b7da435 start_wqthread + 13

Thread 3:
0   libsystem_pthread.dylib         0x00007fff6b7da428 start_wqthread + 0
1   ???                             0xffffffffffffff00 0 + 18446744073709551360

Thread 4:
0   libsystem_pthread.dylib         0x00007fff6b7da428 start_wqthread + 0
1   ???                             0x00017575000186a7 0 + 410620348434087

Thread 5:: Dispatch queue: xctool.runtests
0   libsystem_kernel.dylib          0x00007fff6b721c2a mach_msg_trap + 10
1   libsystem_kernel.dylib          0x00007fff6b722174 mach_msg + 60
2   com.apple.CoreFoundation        0x00007fff3e407da2 __CFRunLoopServiceMachPort + 337
3   com.apple.CoreFoundation        0x00007fff3e4072f1 __CFRunLoopRun + 1654
4   com.apple.CoreFoundation        0x00007fff3e406a28 CFRunLoopRunSpecific + 463
5   com.apple.Foundation            0x00007fff407b4cea -[NSConcreteTask waitUntilExit] + 220
6   xctool                          0x000000010477b529 __LaunchTaskAndFeedSimulatorOutputAndOtestShimEventsToBlock_block_invoke.137 + 31 (TaskUtil.m:423)
7   xctool                          0x0000000104779c4d ReadOutputsAndFeedOuputLinesToBlockOnQueue + 744
8   xctool                          0x000000010477b1e3 LaunchTaskAndFeedSimulatorOutputAndOtestShimEventsToBlock + 606 (TaskUtil.m:428)
9   xctool                          0x0000000104760f40 -[OCUnitOSXLogicTestRunner runTestsAndFeedOutputTo:startupError:otherErrors:] + 326 (OCUnitOSXLogicTestRunner.m:122)
10  xctool                          0x00000001047420ad -[OCUnitTestRunner runTests] + 479 (OCUnitTestRunner.m:209)
11  xctool                          0x0000000104751023 __158-[RunTestsAction 

shepting commented 5 years ago

@ExtremeMan I tried with Richard's PR and it still fails when running make_release.sh in the same way.

ExtremeMan commented 5 years ago

Thanks for debugging, @shepting. Seems like xctool is doing something wrong with file descriptors then.

mgrebenets commented 5 years ago

I found time to run our test suite on High Sierra just now, and all the tests pass. So it's definitely related to the Mojave upgrade.

Not sure how helpful this is, but at least it's definitely not related to the Xcode version.

shepting commented 5 years ago

@ExtremeMan This is fixed by https://github.com/facebook/xctool/pull/759 for us. If that PR gets merged (and ideally pushed to Homebrew), we can close this issue.

@mgrebenets You might want to try that other PR as well.

I can also confirm that this issue doesn't occur on High Sierra. It's a Mojave issue.

ExtremeMan commented 5 years ago

Thanks everyone. Looking forward to merging the PR and pushing a new release.

mgrebenets commented 5 years ago

@shepting Is there a way to get xctool builds with jitpack similar to buck?

shepting commented 5 years ago

@mgrebenets I think that someone would just need to add a jitpack.yml file with the relevant steps to the GitHub repo, like Buck does: https://github.com/facebook/buck/blob/master/jitpack.yml

ExtremeMan commented 5 years ago

I have pushed a fix to master (https://github.com/facebook/xctool/commit/b893f431aeac888625a07321770a4bbb6337ce16). Can anyone run a quick test to confirm it actually fixes the issue for them?

mgrebenets commented 5 years ago

Yay! Been waiting for it 🎉 🎉 🎉