Closed bokken closed 5 months ago
These benchmark results are from windows 11 13th Gen Intel(R) Core(TM) i7-1370P . The 3 different files are: ihe_ovly_pr.dcm is a acr/nema dicom format presentation state - sized ~66KB. image1.dcm is a acr/nema dicom formated uncompressed image sized ~26MB. large.xml is an xml file sized ~52MB.
They show regression of up to 10% for the "legacy"/compatibility case. They show performance improvements ranging from ~0% -> ~30% depending on the content and preset level. In general, preset 6 saw larger improvements than preset 3.
jdk 8
Benchmark (file) (preset) Mode Cnt Score Error Units
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 3 avgt 3 0.598 ± 0.065 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 3 avgt 3 0.650 ± 0.195 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 3 avgt 3 0.545 ± 0.097 ms/op
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 6 avgt 3 2.848 ± 0.401 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 6 avgt 3 2.725 ± 0.242 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 6 avgt 3 1.866 ± 0.451 ms/op
XZCompressionBenchmark.baseline image1.dcm 3 avgt 3 2036.300 ± 82.683 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 3 avgt 3 2136.215 ± 1500.816 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 3 avgt 3 1929.697 ± 678.446 ms/op
XZCompressionBenchmark.baseline image1.dcm 6 avgt 3 4124.933 ± 773.943 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 6 avgt 3 4333.171 ± 576.123 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 6 avgt 3 3692.338 ± 162.947 ms/op
XZCompressionBenchmark.baseline large.xml 3 avgt 3 760.670 ± 147.354 ms/op
XZCompressionBenchmark.compress_legacy large.xml 3 avgt 3 787.818 ± 127.012 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 3 avgt 3 694.098 ± 63.862 ms/op
XZCompressionBenchmark.baseline large.xml 6 avgt 3 6601.508 ± 1201.524 ms/op
XZCompressionBenchmark.compress_legacy large.xml 6 avgt 3 7352.783 ± 1461.493 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 6 avgt 3 5279.845 ± 834.122 ms/op
jdk 11
Benchmark (file) (preset) Mode Cnt Score Error Units
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 3 avgt 3 0.590 ± 0.136 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 3 avgt 3 0.587 ± 0.106 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 3 avgt 3 0.542 ± 0.076 ms/op
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 6 avgt 3 2.914 ± 1.259 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 6 avgt 3 2.760 ± 0.254 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 6 avgt 3 1.933 ± 0.157 ms/op
XZCompressionBenchmark.baseline image1.dcm 3 avgt 3 2113.911 ± 815.591 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 3 avgt 3 2197.521 ± 1398.848 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 3 avgt 3 2084.317 ± 904.885 ms/op
XZCompressionBenchmark.baseline image1.dcm 6 avgt 3 4088.532 ± 847.937 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 6 avgt 3 4137.546 ± 88.137 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 6 avgt 3 3991.293 ± 1587.101 ms/op
XZCompressionBenchmark.baseline large.xml 3 avgt 3 699.659 ± 270.399 ms/op
XZCompressionBenchmark.compress_legacy large.xml 3 avgt 3 707.156 ± 165.419 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 3 avgt 3 665.311 ± 36.397 ms/op
XZCompressionBenchmark.baseline large.xml 6 avgt 3 6901.331 ± 9032.103 ms/op
XZCompressionBenchmark.compress_legacy large.xml 6 avgt 3 7338.866 ± 4002.768 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 6 avgt 3 5932.740 ± 383.178 ms/op
jdk 17
Benchmark (file) (preset) Mode Cnt Score Error Units
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 3 avgt 3 0.598 ± 0.029 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 3 avgt 3 0.606 ± 0.054 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 3 avgt 3 0.553 ± 0.247 ms/op
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 6 avgt 3 2.836 ± 0.701 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 6 avgt 3 2.820 ± 0.179 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 6 avgt 3 1.959 ± 0.431 ms/op
XZCompressionBenchmark.baseline image1.dcm 3 avgt 3 2116.691 ± 743.495 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 3 avgt 3 2268.474 ± 667.310 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 3 avgt 3 2150.507 ± 1549.747 ms/op
XZCompressionBenchmark.baseline image1.dcm 6 avgt 3 4144.431 ± 1380.345 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 6 avgt 3 4481.542 ± 1059.997 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 6 avgt 3 4044.453 ± 461.817 ms/op
XZCompressionBenchmark.baseline large.xml 3 avgt 3 739.495 ± 166.726 ms/op
XZCompressionBenchmark.compress_legacy large.xml 3 avgt 3 734.365 ± 137.722 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 3 avgt 3 684.531 ± 89.739 ms/op
XZCompressionBenchmark.baseline large.xml 6 avgt 3 6847.095 ± 904.741 ms/op
XZCompressionBenchmark.compress_legacy large.xml 6 avgt 3 7686.943 ± 2843.518 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 6 avgt 3 5957.280 ± 338.260 ms/op
jdk 21
Benchmark (file) (preset) Mode Cnt Score Error Units
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 3 avgt 3 0.589 ┬▒ 0.064 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 3 avgt 3 0.578 ┬▒ 0.264 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 3 avgt 3 0.543 ┬▒ 0.068 ms/op
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 6 avgt 3 2.842 ┬▒ 2.348 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 6 avgt 3 2.637 ┬▒ 0.621 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 6 avgt 3 1.860 ┬▒ 0.137 ms/op
XZCompressionBenchmark.baseline image1.dcm 3 avgt 3 2013.813 ┬▒ 2111.947 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 3 avgt 3 2022.160 ┬▒ 244.838 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 3 avgt 3 2079.494 ┬▒ 559.869 ms/op
XZCompressionBenchmark.baseline image1.dcm 6 avgt 3 4200.710 ┬▒ 559.429 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 6 avgt 3 4318.551 ┬▒ 224.317 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 6 avgt 3 4000.231 ┬▒ 122.133 ms/op
XZCompressionBenchmark.baseline large.xml 3 avgt 3 741.106 ┬▒ 304.118 ms/op
XZCompressionBenchmark.compress_legacy large.xml 3 avgt 3 762.184 ┬▒ 193.266 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 3 avgt 3 707.872 ┬▒ 217.096 ms/op
XZCompressionBenchmark.baseline large.xml 6 avgt 3 6644.706 ┬▒ 71.733 ms/op
XZCompressionBenchmark.compress_legacy large.xml 6 avgt 3 7490.776 ┬▒ 1352.132 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 6 avgt 3 5125.599 ┬▒ 723.249 ms/op
Having updated the code to both appropriately handle the niceLenLimit in BT4 and adding the bounds check, here are updated benchmarks. This is still Windows 11, but this time an older i7-9850H. I will continue to work on getting results from the 13th Gen Intel(R) Core(TM) i7-1370P listed above.
The benefits still remain, though perhaps a bit more subdued.
JDK 8
Benchmark (file) (preset) Mode Cnt Score Error Units
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 3 avgt 3 1.050 ± 1.455 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 3 avgt 3 0.990 ± 0.282 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 3 avgt 3 0.909 ± 0.033 ms/op
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 6 avgt 3 5.030 ± 0.966 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 6 avgt 3 4.943 ± 0.998 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 6 avgt 3 3.565 ± 0.323 ms/op
XZCompressionBenchmark.baseline image1.dcm 3 avgt 3 2488.136 ± 362.787 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 3 avgt 3 2737.924 ± 3097.448 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 3 avgt 3 2519.283 ± 269.135 ms/op
XZCompressionBenchmark.baseline image1.dcm 6 avgt 3 5622.677 ± 611.597 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 6 avgt 3 5843.140 ± 712.834 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 6 avgt 3 5167.348 ± 159.264 ms/op
XZCompressionBenchmark.baseline large.xml 3 avgt 3 1015.869 ± 31.331 ms/op
XZCompressionBenchmark.compress_legacy large.xml 3 avgt 3 1043.607 ± 64.194 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 3 avgt 3 985.266 ± 135.448 ms/op
XZCompressionBenchmark.baseline large.xml 6 avgt 3 8714.679 ± 375.658 ms/op
XZCompressionBenchmark.compress_legacy large.xml 6 avgt 3 10698.865 ± 661.517 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 6 avgt 3 8009.948 ± 766.887 ms/op
JDK 11
Benchmark (file) (preset) Mode Cnt Score Error Units
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 3 avgt 3 0.958 ± 0.135 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 3 avgt 3 0.995 ± 0.413 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 3 avgt 3 0.872 ± 0.287 ms/op
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 6 avgt 3 4.946 ± 0.695 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 6 avgt 3 4.737 ± 0.719 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 6 avgt 3 3.078 ± 0.142 ms/op
XZCompressionBenchmark.baseline image1.dcm 3 avgt 3 2573.713 ± 884.641 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 3 avgt 3 2601.671 ± 80.804 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 3 avgt 3 2538.038 ± 91.440 ms/op
XZCompressionBenchmark.baseline image1.dcm 6 avgt 3 5735.404 ± 224.716 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 6 avgt 3 5943.590 ± 1239.205 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 6 avgt 3 5039.090 ± 514.206 ms/op
XZCompressionBenchmark.baseline large.xml 3 avgt 3 1007.470 ± 101.786 ms/op
XZCompressionBenchmark.compress_legacy large.xml 3 avgt 3 1025.208 ± 53.206 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 3 avgt 3 948.410 ± 32.021 ms/op
XZCompressionBenchmark.baseline large.xml 6 avgt 3 8843.118 ± 1184.551 ms/op
XZCompressionBenchmark.compress_legacy large.xml 6 avgt 3 10581.770 ± 57.032 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 6 avgt 3 8495.813 ± 1967.609 ms/op
JDK 21
Benchmark (file) (preset) Mode Cnt Score Error Units
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 3 avgt 3 1.089 ┬▒ 0.521 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 3 avgt 3 0.861 ┬▒ 0.304 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 3 avgt 3 0.804 ┬▒ 0.055 ms/op
XZCompressionBenchmark.baseline ihe_ovly_pr.dcm 6 avgt 3 4.712 ┬▒ 0.703 ms/op
XZCompressionBenchmark.compress_legacy ihe_ovly_pr.dcm 6 avgt 3 4.286 ┬▒ 0.793 ms/op
XZCompressionBenchmark.compress_unsafe_long ihe_ovly_pr.dcm 6 avgt 3 3.270 ┬▒ 6.553 ms/op
XZCompressionBenchmark.baseline image1.dcm 3 avgt 3 3969.325 ┬▒ 812.895 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 3 avgt 3 2543.389 ┬▒ 938.026 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 3 avgt 3 2539.038 ┬▒ 703.522 ms/op
XZCompressionBenchmark.baseline image1.dcm 6 avgt 3 6383.656 ┬▒ 141.257 ms/op
XZCompressionBenchmark.compress_legacy image1.dcm 6 avgt 3 5604.145 ┬▒ 428.556 ms/op
XZCompressionBenchmark.compress_unsafe_long image1.dcm 6 avgt 3 5108.071 ┬▒ 357.189 ms/op
XZCompressionBenchmark.baseline large.xml 3 avgt 3 1211.232 ┬▒ 425.227 ms/op
XZCompressionBenchmark.compress_legacy large.xml 3 avgt 3 1016.582 ┬▒ 61.477 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 3 avgt 3 967.770 ┬▒ 267.865 ms/op
XZCompressionBenchmark.baseline large.xml 6 avgt 3 9171.859 ┬▒ 1681.746 ms/op
XZCompressionBenchmark.compress_legacy large.xml 6 avgt 3 9864.887 ┬▒ 970.370 ms/op
XZCompressionBenchmark.compress_unsafe_long large.xml 6 avgt 3 7410.449 ┬▒ 775.903 ms/op
Are you also planning a VarHandles version which doesn’t need the Unsafe method - I think it should apply for this case.
This is incremental changes to simplify code review process. This first set of changes introduces the array mismatch abstraction and initial changes using Unsafe to read unaligned longs for x86 and aarch64.
This is similar to https://github.com/tukaani-project/xz-java/pull/12, but presents a smaller set of changes to simplify review.
Pull request checklist
Please check if your PR fulfills the following requirements:
Pull request type
Please check the type of change your PR introduces: - [ ] Bugfix - [ ] Feature - [ ] Code style update (formatting, renaming, typo fix) - [ ] Refactoring (no functional changes, no api changes) - [ ] Build related changes - [ ] Documentation content changes - [ ] Other (please describe): ## What is the current behavior?Related Issue URL:
What is the new behavior?
-
-
Does this introduce a breaking change?
Other information