Open oconnor663 opened 2 years ago
No, as you mentioned, the GIL is a big problem. I don't have exact performance measurements at hand right now, but as a rule of thumb, the multithreaded version is faster than the single-threaded one. However, it didn't quite reach the speedup I was hoping for...
I think the performance could be improved substantially by using multiprocessing instead of multithreading in this code. That should take care of the GIL problem and (hopefully) bring a boost in performance. I don't know how long it would take to modify the code that way; I may get around to doing it sometime. This was only an experimental implementation of BLAKE3 in Python for a school project, and I stopped actively updating the repository since the school project is over :)
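To make the multiprocessing idea concrete, here is a rough sketch of how it might look. It is not code from this repository. The helper names (`hash_chunk`, `merge`, `split_chunks`, `reduce_tree`, `digest_serial`, `digest_parallel`) are hypothetical, and hashlib's BLAKE2s stands in for the real BLAKE3 chunk and parent compression, so the result is not a BLAKE3 hash. The tree reduction is also simplified compared to the real BLAKE3 tree layout. The only point is that the chunks are hashed in separate processes, each with its own GIL.

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor

CHUNK_LEN = 1024  # BLAKE3 splits its input into 1024-byte chunks


def hash_chunk(chunk: bytes) -> bytes:
    # ASSUMPTION: BLAKE2s is only a stand-in for the real BLAKE3 chunk
    # compression, so this sketch does not produce BLAKE3 output.
    return hashlib.blake2s(chunk).digest()


def merge(left: bytes, right: bytes) -> bytes:
    # ASSUMPTION: stand-in for BLAKE3's parent-node compression.
    return hashlib.blake2s(left + right).digest()


def split_chunks(data: bytes) -> list:
    # Always return at least one chunk so empty input still gets a digest.
    return [data[i:i + CHUNK_LEN] for i in range(0, len(data), CHUNK_LEN)] or [b""]


def reduce_tree(nodes: list) -> bytes:
    # Simplified reduction: pair neighbours level by level and carry an
    # odd trailing node up unchanged. Real BLAKE3 shapes its tree differently.
    while len(nodes) > 1:
        paired = [merge(nodes[i], nodes[i + 1]) for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2:
            paired.append(nodes[-1])
        nodes = paired
    return nodes[0]


def digest_serial(data: bytes) -> bytes:
    # Reference path: hash every chunk in the current process.
    return reduce_tree([hash_chunk(c) for c in split_chunks(data)])


def digest_parallel(data: bytes, workers: int = 4) -> bytes:
    # Each worker is a separate process with its own interpreter and GIL.
    # chunksize batches the work to cut down on inter-process overhead.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        leaves = list(pool.map(hash_chunk, split_chunks(data), chunksize=64))
    return reduce_tree(leaves)
```

`digest_parallel` should be called from under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module. Every chunk also has to be pickled and sent to a worker, so for small inputs the serial path is likely to be faster.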
Thank you for the kind words. I really enjoyed learning about BLAKE3 for this project, and about hash functions in general, and I was surprised to find that BLAKE3 seems to be under-represented given all its great features.
I hope you succeed in bringing it to Python's hashlib module. I'll be on the lookout for it!
I haven't seen anyone else getting multithreading working. Are you doing something with the GIL that I can't see? Otherwise it's probably standing in the way of good multicore performance.
I put up a pure Python implementation a few months ago too, but I didn't try to do anything with threads: https://github.com/oconnor663/pure_python_blake3
We're going to try to get BLAKE3 into hashlib for Python 3.11. Fingers crossed!