Benchmark plugin swallows error codes from benchmark crashes

clackary commented 9 months ago

We had some benchmarking jobs in CI happily succeeding, but the logs made it immediately evident that some benchmark cases were crashing. CI was receiving a zero exit code from the benchmark plugin, even though the plugin had logged non-zero error codes along the way.

I was able to reproduce this behavior in a blank package with the following benchmark case:

import Benchmark
import Foundation

let benchmarks = {
    Benchmark("SomeBenchmark") { benchmark in
        fatalError()
    }
}

Running the benchmark plugin does log error messages, but it returns 0. There is also a message with error code [5], but I'm not sure where that comes from.

$ swift package --allow-writing-to-package-directory benchmark
Building for debugging...
[1/1] Write swift-version-6E0725D62189FA0A.txt
Build complete! (0.76s)
Building for debugging...
[1/1] Write swift-version-6E0725D62189FA0A.txt
Build complete! (1.59s)
Build complete!
Building BenchmarkTool in release mode...
Building benchmark targets in release mode for benchmark run...
Building SwallowedBenchmark

==================
Running Benchmarks
==================

SwallowedBenchmark/SwallowedBenchmark.swift:8: Fatal error
Process failed: WaitPIDError
Failed to run 'run' for ./.build/arm64-apple-macosx/release/SwallowedBenchmark, error code [5]
Likely your benchmark crashed, try running the tool in the debugger, e.g.
lldb ./.build/arm64-apple-macosx/release/SwallowedBenchmark
Or check Console.app for a backtrace if on macOS.
...

$ echo $?
0

Running the benchmark executable directly returns a non-zero exit code.

$ ./.build/arm64-apple-macosx/release/SwallowedBenchmark
SwallowedBenchmark/SwallowedBenchmark.swift:8: Fatal error
[1]    72363 trace trap  ./.build/arm64-apple-macosx/release/SwallowedBenchmark

$ echo $?
133
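
For context on the numbers above: shells conventionally report a process killed by a signal as 128 plus the signal number, so 133 corresponds to signal 5 (SIGTRAP on macOS and Linux), which matches the "trace trap" line. The error code [5] in the plugin output may be that same signal number, though nothing in this thread confirms it. A minimal sketch of the decoding, where `ExitInterpretation` and `interpret(shellStatus:)` are hypothetical names used only for illustration:

```swift
import Foundation

/// How a shell-reported exit status can be read (hypothetical type).
enum ExitInterpretation: Equatable {
    case success
    case failure(code: Int32)
    case signal(number: Int32)
}

/// Interprets the value of `$?` using the common shell convention that
/// a status above 128 means "terminated by signal (status - 128)".
/// This is a convention, not a guarantee: a program may also exit with
/// such a value on its own.
func interpret(shellStatus status: Int32) -> ExitInterpretation {
    if status == 0 { return .success }
    if status > 128 { return .signal(number: status - 128) }
    return .failure(code: status)
}

// 133 - 128 = 5, the number of SIGTRAP on macOS and Linux.
print(interpret(shellStatus: 133))
```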

I would have expected the benchmark plugin to surface some kind of non-zero exit code to indicate a failure. Is this behavior intended?

hassila commented 9 months ago

Thanks for the report. It's basically a regression introduced when https://github.com/ordo-one/package-benchmark/pull/166 was merged. It was partially addressed for check/update/compare operations in https://github.com/ordo-one/package-benchmark/pull/211, but the simple iterative use case shown here still won't fail.

I believe it should, though, so I will make a fix such that any error is always propagated up (but, as requested in #166, the rest of the benchmarks will still be run; I believe it was one of your colleagues who asked for that :-) ).
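
The behavior described here (keep running the remaining benchmarks, but still report a failure at the end) could be sketched roughly as follows. This is not the plugin's actual implementation: `RunResult`, `runAll`, and `launch` are hypothetical names, and mapping a signal to 128 plus its number is an assumption borrowed from shell convention. Only `Foundation.Process` and its `terminationReason` / `terminationStatus` properties are real API.

```swift
import Foundation

/// Outcome of one benchmark run in this sketch (hypothetical type).
struct RunResult {
    let name: String
    let status: Int32   // 0 means success
}

/// Runs every job even if an earlier one fails, and remembers the first
/// non-zero status so the caller can exit with it afterwards.
func runAll(_ jobs: [(name: String, run: () -> Int32)]) -> (results: [RunResult], exitCode: Int32) {
    var results: [RunResult] = []
    var exitCode: Int32 = 0
    for job in jobs {
        let status = job.run()
        results.append(RunResult(name: job.name, status: status))
        if status != 0 && exitCode == 0 { exitCode = status }
    }
    return (results, exitCode)
}

/// Launches an executable and returns a non-zero status on any failure.
func launch(_ path: String) -> Int32 {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: path)
    do {
        try process.run()
        process.waitUntilExit()
    } catch {
        return 1 // the executable could not be launched at all
    }
    // For .uncaughtSignal, terminationStatus holds the signal number.
    if process.terminationReason == .uncaughtSignal {
        return 128 + process.terminationStatus
    }
    return process.terminationStatus
}

// Stand-in jobs; a real caller would pass closures that call launch(_:).
let outcome = runAll([
    (name: "crashes", run: { 133 }),
    (name: "passes", run: { 0 }),
])
print(outcome.results.map { "\($0.name): \($0.status)" })
// Calling exit(outcome.exitCode) here would surface the failure to CI.
```

Both jobs run, and the crash from the first one still determines the final exit code.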

hassila commented 9 months ago

This will be fixed in the related PR #231. Also see https://github.com/apple/swift-package-manager/issues/7380, which unfortunately prevents distinguishing between regressions and improvements.

clackary commented 9 months ago

Thanks @hassila!

ordo-one/package-benchmark #230