ordo-one / package-benchmark

Swift benchmark runner with many performance metrics and great CI support
Apache License 2.0
326 stars 25 forks source link

Benchmark config thresholds not used when checking 2 baselines #265

Closed aim2120 closed 1 month ago

aim2120 commented 2 months ago

Existing Behavior

When I create two baselines alpha and beta, with some thresholds defined in the Benchmark configurations, those thresholds are not used when I run swift package benchmark check alpha beta. Instead, the default threshold is used.

Expected Behavior

When I define thresholds for baselines and run a check with those baselines, the thresholds I specified are used.

Repro steps

BenchmarkExample.zip

The attached project uses a count environment variable to change the benchmark duration. It also specifies a manual threshold.

import Benchmark
import Foundation

let benchmarks = {
    let metrics: [BenchmarkMetric] = [
        .wallClock,
        .cpuTotal,
    ]
    let thresholds: [BenchmarkMetric: BenchmarkThresholds] = metrics.reduce(into: .init()) { $0[$1] = .p90_200 }
    Benchmark.defaultConfiguration = Benchmark.Configuration(metrics: metrics,
                                                             scalingFactor: .kilo,
                                                             thresholds: thresholds)

    Benchmark("operations") { benchmark in
        for _ in benchmark.scaledIterations {
            let countString = ProcessInfo.processInfo.environment["count"]!
            let count = Int(countString)!
            for _ in 0..<count {
                blackHole(print("", terminator: ""))
            }
        }
    }
}

extension BenchmarkThresholds {
    fileprivate static let p90_200: Self = .init(relative: [
        .p90: 200.0,
    ])
}

Run the following steps to see how running check with 2 baselines uses the default thresholds, while running check with 1 baseline uses the thresholds provided in the configuration.

count=100 swift package --allow-writing-to-package-directory benchmark baseline alpha
count=200 swift package --allow-writing-to-package-directory benchmark baseline beta

# this one will fail, since the default thresholds are used
swift package benchmark baseline check alpha beta

# this one will NOT fail, since the correct thresholds are used
count=200 swift package benchmark baseline check alpha
hassila commented 2 months ago

Thanks for the nice report and reproducer, much appreciated - will unfortunately take some time to have a look at it right now, but will come back, sorry for the inconvenience.

hassila commented 1 month ago

I have this fixed in a branch and seems to work ok now with your test case - hope to gét it up tomorrow. Great if you get the chance to test it then too if possible.

hassila commented 1 month ago

This should be fixed in https://github.com/ordo-one/package-benchmark/pull/284 now, if you can please have a go testing against that (it will be merged fairly soon).

hassila commented 1 month ago

I think this is fixed now, please open a new issue if you see some further issue.