dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.08k stars 4.69k forks source link

Regressions in System.IO.Compression.Brotli #73391

Open performanceautofiler[bot] opened 2 years ago

performanceautofiler[bot] commented 2 years ago

Run Information

Architecture arm64
OS Windows 10.0.19041
Baseline 87aa786b3997a23501912abd4654e4fd9958230c
Compare 55bf5d8a28f4a8c519dbc6edfc1d3d64519f530f
Diff Diff

Regressions in System.IO.Compression.Brotli

Benchmark Baseline Test Test/Base Test Quality Edge Detector Baseline IR Compare IR IR Ratio Baseline ETL Compare ETL
[Compress - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_arm64_Windows 10.0.19041/System.IO.Compression.Brotli.Compress(level%3a%20Optimal%2c%20file%3a%20%22alice29.txt%22).html>) 57.21 μs 2.64 ms 46.07 0.54 False
[Compress - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_arm64_Windows 10.0.19041/System.IO.Compression.Brotli.Compress(level%3a%20Optimal%2c%20file%3a%20%22TestDocument.pdf%22).html>) 38.71 μs 513.81 μs 13.27 0.57 False

graph graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.IO.Compression.Brotli*'
### Payloads [Baseline]() [Compare]() ### Histogram #### System.IO.Compression.Brotli.Compress(level: Optimal, file: "alice29.txt") ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 2.6358665277777775 > 49.21159596309648. IsChangePoint: Marked as a change because one of 7/15/2022 6:44:31 PM, 7/18/2022 2:17:16 AM falls between 7/8/2022 10:31:34 AM and 7/18/2022 2:17:16 AM. IsRegressionStdDev: Marked as regression because -418.54307728663485 (T) = (0 -2613168.223104056) / Math.Sqrt((13561413.09071954 / (45)) + (334918084.80197054 / (9))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (9) - 2, .025) and -51.65213098762557 = (49630.81596295141 - 2613168.223104056) / 49630.81596295141 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ```#### System.IO.Compression.Brotli.Compress(level: Optimal, file: "TestDocument.pdf") ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 513.8109302995392 > 46.59669667875596. IsChangePoint: Marked as a change because one of 7/15/2022 6:44:31 PM, 7/18/2022 2:17:16 AM falls between 7/8/2022 10:31:34 AM and 7/18/2022 2:17:16 AM. IsRegressionStdDev: Marked as regression because -714.8751641957057 (T) = (0 -514246.41449052736) / Math.Sqrt((9051262.153242894 / (45)) + (2145381.6111228354 / (9))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (9) - 2, .025) and -11.756230764376724 = (40313.351489894725 - 514246.41449052736) / 40313.351489894725 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-io-compression See info in area-owners.md if you want to be subscribed.

Issue Details
### Run Information Architecture | arm64 -- | -- OS | Windows 10.0.19041 Baseline | [87aa786b3997a23501912abd4654e4fd9958230c](https://github.com/dotnet/runtime/commit/87aa786b3997a23501912abd4654e4fd9958230c) Compare | [55bf5d8a28f4a8c519dbc6edfc1d3d64519f530f](https://github.com/dotnet/runtime/commit/55bf5d8a28f4a8c519dbc6edfc1d3d64519f530f) Diff | [Diff](https://github.com/dotnet/runtime/compare/87aa786b3997a23501912abd4654e4fd9958230c...55bf5d8a28f4a8c519dbc6edfc1d3d64519f530f) ### Regressions in System.IO.Compression.Brotli Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [Compress - Duration of single invocation]() | 57.21 μs | 2.64 ms | 46.07 | 0.54 | False | | | [Compress - Duration of single invocation]() | 38.71 μs | 513.81 μs | 13.27 | 0.57 | False | | | ![graph]() ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.IO.Compression.Brotli*' ```
### Payloads [Baseline]() [Compare]() ### Histogram #### System.IO.Compression.Brotli.Compress(level: Optimal, file: "alice29.txt") ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 2.6358665277777775 > 49.21159596309648. IsChangePoint: Marked as a change because one of 7/15/2022 6:44:31 PM, 7/18/2022 2:17:16 AM falls between 7/8/2022 10:31:34 AM and 7/18/2022 2:17:16 AM. IsRegressionStdDev: Marked as regression because -418.54307728663485 (T) = (0 -2613168.223104056) / Math.Sqrt((13561413.09071954 / (45)) + (334918084.80197054 / (9))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (9) - 2, .025) and -51.65213098762557 = (49630.81596295141 - 2613168.223104056) / 49630.81596295141 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ```#### System.IO.Compression.Brotli.Compress(level: Optimal, file: "TestDocument.pdf") ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 513.8109302995392 > 46.59669667875596. IsChangePoint: Marked as a change because one of 7/15/2022 6:44:31 PM, 7/18/2022 2:17:16 AM falls between 7/8/2022 10:31:34 AM and 7/18/2022 2:17:16 AM. IsRegressionStdDev: Marked as regression because -714.8751641957057 (T) = (0 -514246.41449052736) / Math.Sqrt((9051262.153242894 / (45)) + (2145381.6111228354 / (9))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (9) - 2, .025) and -11.756230764376724 = (40313.351489894725 - 514246.41449052736) / 40313.351489894725 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)
Author: performanceautofiler[bot]
Assignees: DrewScoggins
Labels: `area-System.IO.Compression`, `Regression`
Milestone: -
kunalspathak commented 2 years ago

https://github.com/dotnet/runtime/pull/72266 @stephentoub

stephentoub commented 2 years ago

This is a perf test bug. cc: @adamsitnik

The test is assuming that all work associated with a write happens during the write: https://github.com/dotnet/performance/blob/e03024567eb5b6c66e51751c08f0ae1f7f3d3059/src/benchmarks/micro/libraries/System.IO.Compression/CompressionStreamPerfTestBase.cs#L57-L64 That isn't the case. In fact, the higher the compression algorithm, the larger the internal buffer will be used by Brotli, which means at the highest level of compression, the buffer is so large that basically no work happens in the Write call other than copying the data over to the native buffer. And since the stream is never being disposed, the bulk of the work isn't actually happening, and so the test isn't actually measuring what it's supposed to be measuring.

With the test as is, when I run it locally I get similarly terrible-looking results:

Method Runtime level file Mean Ratio
Compress .NET 6.0 Optimal TestDocument.pdf 15.755 us 1.00
Compress .NET 7.0 Optimal TestDocument.pdf 467.284 us 29.89
Compress .NET 6.0 Optimal alice29.txt 35.456 us 1.00
Compress .NET 7.0 Optimal alice29.txt 2,202.550 us 62.20

But, when I fix the test, by changing: https://github.com/dotnet/performance/blob/e03024567eb5b6c66e51751c08f0ae1f7f3d3059/src/benchmarks/micro/libraries/System.IO.Compression/CompressionStreamPerfTestBase.cs#L62

var compressor = CreateStream(CompressedFile.CompressedDataStream, level);

to instead be:

using var compressor = CreateStream(CompressedFile.CompressedDataStream, level);

and changing: https://github.com/dotnet/performance/blob/e03024567eb5b6c66e51751c08f0ae1f7f3d3059/src/benchmarks/micro/libraries/System.IO.Compression/Brotli.cs#L16

public override Stream CreateStream(Stream stream, CompressionLevel level) => new BrotliStream(stream, level);

to instead be:

public override Stream CreateStream(Stream stream, CompressionLevel level) => new BrotliStream(stream, level, leaveOpen: true);

I then get these results:

Method Runtime level file Mean Ratio
Compress .NET 6.0 Optimal TestDocument.pdf 457,331.4 us 1.000
Compress .NET 7.0 Optimal TestDocument.pdf 1,145.6 us 0.003
Compress .NET 6.0 Optimal alice29.txt 203,330.7 us 1.00
Compress .NET 7.0 Optimal alice29.txt 3,024.0 us 0.02

which is way more in line with the expected result of my PR.