Regressions in System.IO.Compression.Brotli

performanceautofiler[bot] commented 2 years ago

Run Information

Architecture	arm64
OS	Windows 10.0.19041
Baseline	87aa786b3997a23501912abd4654e4fd9958230c
Compare	55bf5d8a28f4a8c519dbc6edfc1d3d64519f530f
Diff	Diff

Regressions in System.IO.Compression.Brotli

Benchmark	Baseline	Test	Test/Base	Test Quality	Edge Detector	Baseline IR	Compare IR	IR Ratio	Baseline ETL	Compare ETL
[Compress - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_arm64_Windows 10.0.19041/System.IO.Compression.Brotli.Compress(level%3a%20Optimal%2c%20file%3a%20%22alice29.txt%22).html>)	57.21 μs	2.64 ms	46.07	0.54	False
[Compress - Duration of single invocation](<https://pvscmdupload.z22.web.core.windows.net/reports/allTestHistory/refs/heads/main_arm64_Windows 10.0.19041/System.IO.Compression.Brotli.Compress(level%3a%20Optimal%2c%20file%3a%20%22TestDocument.pdf%22).html>)	38.71 μs	513.81 μs	13.27	0.57	False

graph graph Test Report

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.IO.Compression.Brotli*'

### Payloads [Baseline]() [Compare]() ### Histogram #### System.IO.Compression.Brotli.Compress(level: Optimal, file: "alice29.txt") ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 2.6358665277777775 > 49.21159596309648. IsChangePoint: Marked as a change because one of 7/15/2022 6:44:31 PM, 7/18/2022 2:17:16 AM falls between 7/8/2022 10:31:34 AM and 7/18/2022 2:17:16 AM. IsRegressionStdDev: Marked as regression because -418.54307728663485 (T) = (0 -2613168.223104056) / Math.Sqrt((13561413.09071954 / (45)) + (334918084.80197054 / (9))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (9) - 2, .025) and -51.65213098762557 = (49630.81596295141 - 2613168.223104056) / 49630.81596295141 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ```#### System.IO.Compression.Brotli.Compress(level: Optimal, file: "TestDocument.pdf") ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 513.8109302995392 > 46.59669667875596. IsChangePoint: Marked as a change because one of 7/15/2022 6:44:31 PM, 7/18/2022 2:17:16 AM falls between 7/8/2022 10:31:34 AM and 7/18/2022 2:17:16 AM. IsRegressionStdDev: Marked as regression because -714.8751641957057 (T) = (0 -514246.41449052736) / Math.Sqrt((9051262.153242894 / (45)) + (2145381.6111228354 / (9))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (9) - 2, .025) and -11.756230764376724 = (40313.351489894725 - 514246.41449052736) / 40313.351489894725 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-io-compression See info in area-owners.md if you want to be subscribed.

Issue Details

### Run Information Architecture | arm64 -- | -- OS | Windows 10.0.19041 Baseline | [87aa786b3997a23501912abd4654e4fd9958230c](https://github.com/dotnet/runtime/commit/87aa786b3997a23501912abd4654e4fd9958230c) Compare | [55bf5d8a28f4a8c519dbc6edfc1d3d64519f530f](https://github.com/dotnet/runtime/commit/55bf5d8a28f4a8c519dbc6edfc1d3d64519f530f) Diff | [Diff](https://github.com/dotnet/runtime/compare/87aa786b3997a23501912abd4654e4fd9958230c...55bf5d8a28f4a8c519dbc6edfc1d3d64519f530f) ### Regressions in System.IO.Compression.Brotli Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio | Baseline ETL | Compare ETL -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- [Compress - Duration of single invocation]() | 57.21 μs | 2.64 ms | 46.07 | 0.54 | False | | | [Compress - Duration of single invocation]() | 38.71 μs | 513.81 μs | 13.27 | 0.57 | False | | | ![graph]() ![graph]() [Test Report]() ### Repro ```cmd git clone https://github.com/dotnet/performance.git py .\performance\scripts\benchmarks_ci.py -f net6.0 --filter 'System.IO.Compression.Brotli*' ```

### Payloads [Baseline]() [Compare]() ### Histogram #### System.IO.Compression.Brotli.Compress(level: Optimal, file: "alice29.txt") ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 2.6358665277777775 > 49.21159596309648. IsChangePoint: Marked as a change because one of 7/15/2022 6:44:31 PM, 7/18/2022 2:17:16 AM falls between 7/8/2022 10:31:34 AM and 7/18/2022 2:17:16 AM. IsRegressionStdDev: Marked as regression because -418.54307728663485 (T) = (0 -2613168.223104056) / Math.Sqrt((13561413.09071954 / (45)) + (334918084.80197054 / (9))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (9) - 2, .025) and -51.65213098762557 = (49630.81596295141 - 2613168.223104056) / 49630.81596295141 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ```#### System.IO.Compression.Brotli.Compress(level: Optimal, file: "TestDocument.pdf") ```log ``` ### Description of detection logic ```IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsRegressionBase: Marked as regression because the compare was 5% greater than the baseline, and the value was not too small. IsRegressionChecked: Marked as regression because the three check build points were 0.05 greater than the baseline. IsRegressionWindowed: Marked as regression because 513.8109302995392 > 46.59669667875596. IsChangePoint: Marked as a change because one of 7/15/2022 6:44:31 PM, 7/18/2022 2:17:16 AM falls between 7/8/2022 10:31:34 AM and 7/18/2022 2:17:16 AM. IsRegressionStdDev: Marked as regression because -714.8751641957057 (T) = (0 -514246.41449052736) / Math.Sqrt((9051262.153242894 / (45)) + (2145381.6111228354 / (9))) is less than -2.0066468050606243 = MathNet.Numerics.Distributions.StudentT.InvCDF(0, 1, (45) + (9) - 2, .025) and -11.756230764376724 = (40313.351489894725 - 514246.41449052736) / 40313.351489894725 is less than -0.05. IsImprovementBase: Marked as not an improvement because the compare was not 5% less than the baseline, or the value was too small. IsChangeEdgeDetector: Marked not as a regression because Edge Detector said so. ``` ### Docs [Profiling workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/profiling-workflow-dotnet-runtime.md) [Benchmarking workflow for dotnet/runtime repository](https://github.com/dotnet/performance/blob/master/docs/benchmarking-workflow-dotnet-runtime.md)

Author:	performanceautofiler[bot]
Assignees:	DrewScoggins
Labels:	`area-System.IO.Compression`, `Regression`
Milestone:	-

kunalspathak commented 2 years ago

https://github.com/dotnet/runtime/pull/72266 @stephentoub

stephentoub commented 2 years ago

This is a perf test bug. cc: @adamsitnik

The test is assuming that all work associated with a write happens during the write: https://github.com/dotnet/performance/blob/e03024567eb5b6c66e51751c08f0ae1f7f3d3059/src/benchmarks/micro/libraries/System.IO.Compression/CompressionStreamPerfTestBase.cs#L57-L64 That isn't the case. In fact, the higher the compression algorithm, the larger the internal buffer will be used by Brotli, which means at the highest level of compression, the buffer is so large that basically no work happens in the Write call other than copying the data over to the native buffer. And since the stream is never being disposed, the bulk of the work isn't actually happening, and so the test isn't actually measuring what it's supposed to be measuring.

With the test as is, when I run it locally I get similarly terrible-looking results:

Method	Runtime	level	file	Mean	Ratio
Compress	.NET 6.0	Optimal	TestDocument.pdf	15.755 us	1.00
Compress	.NET 7.0	Optimal	TestDocument.pdf	467.284 us	29.89

Compress	.NET 6.0	Optimal	alice29.txt	35.456 us	1.00
Compress	.NET 7.0	Optimal	alice29.txt	2,202.550 us	62.20

But, when I fix the test, by changing: https://github.com/dotnet/performance/blob/e03024567eb5b6c66e51751c08f0ae1f7f3d3059/src/benchmarks/micro/libraries/System.IO.Compression/CompressionStreamPerfTestBase.cs#L62

var compressor = CreateStream(CompressedFile.CompressedDataStream, level);

to instead be:

using var compressor = CreateStream(CompressedFile.CompressedDataStream, level);

and changing: https://github.com/dotnet/performance/blob/e03024567eb5b6c66e51751c08f0ae1f7f3d3059/src/benchmarks/micro/libraries/System.IO.Compression/Brotli.cs#L16

public override Stream CreateStream(Stream stream, CompressionLevel level) => new BrotliStream(stream, level);

to instead be:

public override Stream CreateStream(Stream stream, CompressionLevel level) => new BrotliStream(stream, level, leaveOpen: true);

I then get these results:

Method	Runtime	level	file	Mean	Ratio
Compress	.NET 6.0	Optimal	TestDocument.pdf	457,331.4 us	1.000
Compress	.NET 7.0	Optimal	TestDocument.pdf	1,145.6 us	0.003

Compress	.NET 6.0	Optimal	alice29.txt	203,330.7 us	1.00
Compress	.NET 7.0	Optimal	alice29.txt	3,024.0 us	0.02

which is way more in line with the expected result of my PR.

dotnet / runtime