lazyhamster / IntChecker

Hash sums calculation/verification plugin for Far Manager.
GNU General Public License v3.0

IntChecker SHA-256 2x slower than `sha256sum` and `rhash` #42

Open przemoc opened 1 year ago

przemoc commented 1 year ago

Far Manager 3.0.0.6116 with IntChecker 2.8.2: Integrity Checker Plugin 2.8.2 - SHA-256 on ggml-medium.en.bin

Git Bash (2.40.1.windows.1):

przemoc@NUC11PHKi7C002 MINGW64 /d/python/whisper-workspace/models/ggml
$ sha256sum --version
sha256sum (GNU coreutils) 8.32
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Ulrich Drepper, Scott Miller, and David Madore.

przemoc@NUC11PHKi7C002 MINGW64 /d/python/whisper-workspace/models/ggml
$ time sha256sum ggml-medium.en.bin
cc37e93478338ec7700281a7ac30a10128929eb8f427dda2e865faa8f6da4356 *ggml-medium.en.bin

real    0m5.382s
user    0m2.406s
sys     0m0.171s

MSYS2 UCRT64:

przemoc@NUC11PHKi7C002 UCRT64 /d/python/whisper-workspace/models/ggml
$ sha256sum --version
sha256sum (GNU coreutils) 8.32
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Ulrich Drepper, Scott Miller, and David Madore.

przemoc@NUC11PHKi7C002 UCRT64 /d/python/whisper-workspace/models/ggml
$ time sha256sum ggml-medium.en.bin
cc37e93478338ec7700281a7ac30a10128929eb8f427dda2e865faa8f6da4356 *ggml-medium.en.bin

real    0m5.300s
user    0m4.531s
sys     0m0.109s

przemoc@NUC11PHKi7C002 UCRT64 /d/python/whisper-workspace/models/ggml
$ rhash --version
RHash v1.4.2

przemoc@NUC11PHKi7C002 UCRT64 /d/python/whisper-workspace/models/ggml
$ time rhash --sha256 ggml-medium.en.bin
cc37e93478338ec7700281a7ac30a10128929eb8f427dda2e865faa8f6da4356  ggml-medium.en.bin

real    0m5.106s
user    0m0.000s
sys     0m0.000s

Tested on: https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-medium.en.bin

As you can see, IntChecker's SHA-256 is about 2x slower (~10 s) than `sha256sum` and `rhash` (~5 s each) on the same file.

HW: NUC11PHKi7C:

OS:

Px-x64 commented 3 months ago

Did you reboot between each test to be sure that file was not cached in memory on following runs?

przemoc commented 3 months ago

> Did you reboot between each test to be sure that file was not cached in memory on following runs?

For a hashing speed test you actually want the file to be cached, ideally, so that it is mostly a SHA-256 calculation benchmark and not a disk read + SHA-256 calculation benchmark. (And my HW specs, provided in the previous comment, show that I'm not short on memory and don't have a slow disk anyway.)
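A minimal sketch of the warm-cache measurement described above, assuming Python 3 is available. The helper names `sha256_file` and `bench_sha256` and the 1 MiB chunk size are assumptions made for illustration; they are not part of IntChecker, rhash, or coreutils. The first pass is untimed and only serves to pull the file into the OS cache (if it fits in RAM), so the timed passes mostly reflect hashing cost rather than disk reads.

```python
import hashlib
import time


def sha256_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # chunk_size is an arbitrary choice (1 MiB)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def bench_sha256(path, runs=3):
    """Hash once untimed (warm-up), then time `runs` further passes.

    Returns (hex digest, list of wall-clock timings in seconds).
    """
    # Warm-up pass: not timed, it only gets the file into the OS cache.
    digest = sha256_file(path)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        again = sha256_file(path)
        timings.append(time.perf_counter() - start)
        # Sanity check: every pass must produce the same digest.
        if again != digest:
            raise RuntimeError("digest changed between runs")
    return digest, timings
```

Calling `bench_sha256("ggml-medium.en.bin")` and taking `min(timings)` would give a warm-cache figure to set against the `real` times reported by `time sha256sum` and `time rhash --sha256`. The absolute numbers depend on the Python build and its hashing backend, so this is only a way to check that conditions are the same across runs, not a reference result.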

To rule out suggestions that this is a 32-bit vs 64-bit difference, I retested using a recent Far x64 version: Rhash 1.4.4 vs Far 3.0.6300.0 x64 + Integrity Checker Plugin 2.8.2

Px-x64 commented 3 months ago

> > Did you reboot between each test to be sure that file was not cached in memory on following runs?

> For a hashing speed test you actually want the file to be cached, ideally, so that it is mostly a SHA-256 calculation benchmark and not a disk read + SHA-256 calculation benchmark. (And my HW specs, provided in the previous comment, show that I'm not short on memory and don't have a slow disk anyway.)

I agree, but my point was that it was not clear whether the test conditions were the same for all runs :) Anyway, I saw a discussion in the Russian section of the Far forum, and it looks like the author is aware of the issue and is going to solve it one way or another in the next version (no ETA).