MobileNativeFoundation / bluepill

Bluepill is a reliable iOS testing tool that runs UI tests using multiple simulators on a single machine
BSD 2-Clause "Simplified" License
3.19k stars 232 forks source link

Failures in test setup are not retried and missing from stats (tested with EarlGray but possibly XCTest too) #323

Open jerrymarino opened 5 years ago

jerrymarino commented 5 years ago

Good afternoon!

I'm using EarlGrey with Bluepill, and noticed that there was a subset of failures that are never retried or reported.

Here's the stats:

--------------
Run Statistics
--------------
Start Time:           2019-04-09 20:33:31 +0000
End Time:             2019-04-09 20:36:58 +0000
Total execution time: 206.874946 seconds

Times Taken
-----------
[Attempt 1] Create Simulator:                                28.058206 seconds
[Attempt 1] Launch Application:                              0.000000 seconds
[Attempt 1] Run Tests:                                       170.363215 seconds
All Tests:                                                   91.399243 seconds
[Attempt 1] {[Selected tests]}:                              91.399134 seconds
[Attempt 1] {[XX.xctest]}:        91.394505 seconds
[Attempt 1] {[XX 91.388607 seconds
[Attempt 1] Delete Simulator:                                7.257405 seconds

Summary
-------
Total Tests Executed:           0
Failed Tests (includes errors): 0
Error Tests:                    0
Timeout due to test run-time:   0
Timeout due to no output:       0
Retries:                        0
Application Crashes:            0
Simulator Crashes:              0
Simulator Creation Failures:    0
Simulator Deletion Failures:    0
App Install Failures:           0
App Launch Failures:            0

Exit Code: 0

XCTest says:


TestSuite: Selected tests Start: 2019-04-09 20:35:19 +0000 End: 2019-04-09 20:36:08 +0000 Ended: YES Passed: NO Total: 0 Fail: 0 Errors: 0 Time: 49.216000 Log: 
---
XCTestOutputBarrierXCTestOutputBarrier

---

TestSuite: XX.xctest Start: 2019-04-09 20:35:19 +0000 End: 2019-04-09 20:36:08 +0000 Ended: YES Passed: NO Total: 0 Fail: 0 Errors: 0 Time: 49.213000 Log: 
---
(null)
---

TestSuite: XX Start: 2019-04-09 20:35:19 +0000 End: 2019-04-09 20:36:08 +0000 Ended: YES Passed: NO Total: 0 Fail: 0 Errors: 0 Time: 0.000000 Log: 
---
... signficant EarlGrey AX output
---

Any suggestions @bayareabear or @ob ?

ob commented 5 years ago

Hmm... I suspect the output parser is getting confused from EarlGray's output. Do you have an app where this reproduces?

jerrymarino commented 5 years ago

Thanks @ob - it's totally possible, I'm not super familiar with the expected behavior

One other interesting bit from a similar log:

Test Suite 'XX' failed at 2019-04-09 15:26:56.965.
         Executed 0 tests, with 1 failure (1 unexpected) in 0.000 (4.644) seconds
XCTestOutputBarrierTest Suite 'XX.xctest' failed at 2019-04-09 15:26:56.979.
         Executed 0 tests, with 1 failure (1 unexpected) in 0.000 (4.660) seconds
XCTestOutputBarrierTest Suite 'Selected tests' failed at 2019-04-09 15:26:56.981.
         Executed 0 tests, with 1 failure (1 unexpected) in 0.000 (4.667) seconds
XCTestOutputBarrierXCTestOutputBarrier

I really don't get what this means in XCTest terms: 0 tests with 1 failure.

This bluepill invocation was running with --repeat-count 0 so perhaps thats' expected behavior.

I also didn't see any entries in the log file which would trigger -[BPExecutionPhaseProtocol onTestCaseFailed..] or -[BPExecutionPhaseProtocol onTestCasePassed..].

None the less, I ran a similar log file through the test harness and am seeing this:

{87381} 20190409.170742 [34m[  INFO  ][0m (BLUEPILL) BPTree parser completed
{87381} 20190409.170742 [33m[ DEBUG  ][0m (BLUEPILL) [Selected tests] Mismatch numberOfFailures calculated (0) vs reported (1)
{87381} 20190409.170742 [33m[ DEBUG  ][0m (BLUEPILL) [XX.xctest] Mismatch numberOfFailures calculated (0) vs reported (1)
{87381} 20190409.170742 [33m[ DEBUG  ][0m (BLUEPILL) [XXTests] Mismatch numberOfFailures calculated (0) vs reported (1)
{87381} 20190409.170742 [33m[ DEBUG  ][0m (BLUEPILL) [XXTests] Mismatch totalTime calculated (0.000000) vs reported (4.644000)
khandpur commented 5 years ago

that can happen if your +setUp test method throws an exception. For example I see:

Test Suite 'rdar46239726Tests' failed at 2019-04-09 23:06:30.857.
     Executed 0 tests, with 1 failure (1 unexpected) in 0.000 (0.004) seconds
Test Suite 'rdar46239726Tests.xctest' failed at 2019-04-09 23:06:30.858.
     Executed 0 tests, with 1 failure (1 unexpected) in 0.000 (0.005) seconds
Test Suite 'Selected tests' failed at 2019-04-09 23:06:30.859.
     Executed 0 tests, with 1 failure (1 unexpected) in 0.000 (0.006) seconds
jerrymarino commented 5 years ago

Much appreciated @khandpur! Do you have any suggestions for how to make bluepill rerun tests that hit this? It looks like a lot of the cases that hit this issue are happening due exceptions in setup.

ob commented 5 years ago

I think this is a complicated change. AFAICT, Bluepill doesn't have a good place to put failures in test suites (not test cases). I'm not even sure where that failure would fit in the JUnit report, or how the retry mechanism would work... let me think about this and see if I can come up with something 😅