ip7z / 7zip

7-Zip
Performance regression with gcc-13.2.0? #11

Closed: pauljurczak closed 5 months ago

pauljurczak commented 5 months ago

I've run a benchmark on Ubuntu 24.04 with a packaged version (GCC 13.2.0: SSE2) and a version from https://www.7-zip.org/download.html (GCC 9.4.0: SSE2). The former:

paul@cube:~$ taskset -c 3 7z b 3 -mmt1

7-Zip 23.01 (x64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
 64-bit locale=en_US.UTF-8 Threads:4 OPEN_MAX:1024

 mt1
Compiler: 13.2.0 GCC 13.2.0: SSE2
Linux : 6.6.0-14-generic : #14-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 30 10:27:29 UTC 2023 : x86_64
PageSize:4KB THP:madvise hwcap:2 hwcap2:2
Intel(R) N100 (B06E0) 

1T CPU Freq (MHz):  3384  3369  3386  3383  3390  3388  3388

RAM size:   11690 MB,  # CPU hardware threads:   1 / 4 : 8
RAM usage:    437 MB,  # Benchmark threads:      1

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       4914   100   4799   4781  |      45671   100   3892   3899
23:       4314   100   4381   4396  |      45455   100   3934   3935
24:       4021   100   4324   4324  |      44750   100   3922   3929
25:       3827   100   4369   4370  |      43758   100   3896   3895
22:       4897   100   4761   4764  |      45703   100   3901   3902
23:       4311   100   4404   4393  |      45466   100   3934   3936
24:       4028   100   4335   4332  |      44812   100   3932   3934
25:       3826   100   4369   4369  |      43773   100   3896   3896
22:       4883   100   4742   4751  |      45638   100   3901   3897
23:       4299   100   4381   4381  |      45306   100   3925   3922
24:       4015   100   4313   4318  |      44730   100   3922   3927
25:       3815   100   4359   4356  |      43805   100   3896   3899
----------------------------------  | ------------------------------
Avr:      4263   100   4461   4461  |      44906   100   3913   3914
Tot:             100   4187   4188

decompresses much slower (3914 MIPS) than the latter (5983 MIPS):

paul@cube:~/7z2301-linux-x64$ taskset -c 3 ./7zz b 3 -mmt1

7-Zip (z) 23.01 (x64) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
 64-bit locale=en_US.UTF-8 Threads:4 OPEN_MAX:1024, ASM

 mt1
Compiler: 9.4.0 GCC 9.4.0: SSE2
Linux : 6.6.0-14-generic : #14-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 30 10:27:29 UTC 2023 : x86_64
PageSize:4KB THP:madvise hwcap:2 hwcap2:2
Intel(R) N100 (B06E0) 

1T CPU Freq (MHz):  3349  3369  3379  3372  3387  3389  3388

RAM size:   11690 MB,  # CPU hardware threads:   1 / 4 : 8
RAM usage:    437 MB,  # Benchmark threads:      1

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       4898   100   4780   4765  |      70254   100   6003   5998
23:       4313   100   4404   4395  |      69701   100   6039   6033
24:       4024   100   4324   4327  |      66763   100   5861   5861
25:       3825   100   4369   4367  |      66700   100   5964   5937
22:       4893   100   4761   4760  |      69758   100   5961   5956
23:       4319   100   4404   4401  |      69700   100   6018   6033
24:       4034   100   4356   4338  |      68719   100   6017   6033
25:       3822   100   4369   4365  |      66794   100   5940   5945
22:       4895   100   4761   4763  |      70236   100   5982   5997
23:       4320    99   4427   4402  |      69604   100   6018   6025
24:       4034   100   4345   4338  |      68771   100   6017   6037
25:       3823   100   4364   4365  |      66687   100   5940   5936
----------------------------------  | ------------------------------
Avr:      4267   100   4472   4465  |      68641   100   5980   5983
Tot:             100   5226   5224
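
To put the gap in numbers, here is a small sketch that compares the two runs using only the "Avr:" rating values shown in the tables above. The variable names are made up for illustration; nothing here comes from 7-Zip itself.

```python
# Average ratings (MIPS) copied from the "Avr:" rows of the two runs above.
packaged = {"compress": 4461, "decompress": 3914}   # Ubuntu package, GCC 13.2.0
download = {"compress": 4465, "decompress": 5983}   # 7-zip.org binary, GCC 9.4.0

# Ratio of the downloaded binary's rating to the packaged binary's rating.
decompress_ratio = download["decompress"] / packaged["decompress"]
compress_ratio = download["compress"] / packaged["compress"]

print(f"decompression: {decompress_ratio:.2f}x")
print(f"compression:   {compress_ratio:.3f}x")
```

Decompression differs by roughly 1.5x while compression is essentially identical, which suggests the difference is confined to the decoder rather than being a general compiler regression.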

ip7z commented 5 months ago

There is assembler code for LZMA decompression that can be used instead of the default C code. It requires the asmc assembler (https://github.com/nidud/asmc). To compile 7-Zip for x86-64 with the asmc assembler:

make -j -f ../../cmpl_gcc_x64.mak

pauljurczak commented 5 months ago

Do you mean that the "ASM" at the end of the line "64-bit locale=en_US.UTF-8 Threads:4 OPEN_MAX:1024, ASM" indicates that I'm not comparing apples to apples?

ip7z commented 5 months ago

Yes, the word ASM indicates that assembler code was used. Probably the Linux (Ubuntu) maintainers of the 7zip package don't want to use the asmc assembler. asmc is not a standard assembler for GCC/Linux. 7-Zip uses asmc syntax because asmc is compatible with the Windows assembler, MASM, and the original assembler code of 7-Zip was developed for MASM (Windows). So asmc allows us to use the source code from the Windows version on Linux as well.
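
As a quick way to check this before comparing two builds, the following sketch looks for the ASM marker in the benchmark banner. It assumes the banner layout seen in the two outputs in this thread (a line containing "OPEN_MAX" with an optional trailing ", ASM"); other 7-Zip versions may print it differently, and the function name `banner_has_asm` is made up for illustration.

```python
def banner_has_asm(output: str) -> bool:
    """Return True if the 7-Zip banner line lists ASM as a build feature."""
    for line in output.splitlines():
        if "OPEN_MAX" in line:
            # Split on whitespace and commas so "ASM" is matched as a whole token.
            tokens = line.replace(",", " ").split()
            return "ASM" in tokens
    return False  # banner line not found


# Banner lines copied from the two benchmark runs in this thread.
packaged_banner = " 64-bit locale=en_US.UTF-8 Threads:4 OPEN_MAX:1024"
download_banner = " 64-bit locale=en_US.UTF-8 Threads:4 OPEN_MAX:1024, ASM"

print(banner_has_asm(packaged_banner))
print(banner_has_asm(download_banner))
```

If the two builds disagree on this flag, a difference in decompression rating is expected and does not by itself point to a compiler problem.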

pauljurczak commented 5 months ago

Thank you. This explains the difference in performance.