Closed chaudum closed 9 years ago
Hey @chaudum, thanks for the suggestion,
I'll be a bit wary here since this actually used to work this way, but had changed.
cc @koenbok — do you know how much faster this had made the build? https://github.com/koenbok/Cactus/commit/fa1a08aed5bc6659304097d5ad7e653c553c1b11
If the speed gain is significant, I guess graciously falling back might be preferable (or possibly using md5sum
).
Cheers,
I changed it so that it uses md5
if available, otherwise fallback to md5sum
.
I'm still a bit concerned considering that we're now creating two subprocesses (which
, then the actual md5[sum]
one) every time we're computing a checksum.
I feel we need to measure this to know what our speed gains are, otherwise we might end up wasting cycles trying to do the smart thing. I'll try to do that over the weekend, unless you'd like to get to it first?
Cheers,
I actually did some tests with this for another project and it's less performant then something like this: http://stackoverflow.com/questions/7829499/using-hashlib-to-compute-md5-digest-of-a-file-in-python-3
So we should change it back. I initially did it because calculating md5s for big files took a ton of memory. Typical example of me trying to be smart and making it less good :-)
@koenbok Thanks! We should probably go with @chaudum's original commit then (https://github.com/chaudum/Cactus/commit/5de20d0cc2cb0f6164f1af10009a09ca9e42a72c)
@chaudum — mind updating the PR with just that one? : )
Thanks!
@krallin @koenbok ok, will update the PR then
made the change slightly more pythonic than the initial commit.
Guys,
Just a quick note on this. This change is probably not what we want. The following code:
for buf in fp.read(65536):
hasher.update(buf)
Doesn't do what we want for two reasons:
65536
chars at maximum.What we want probably looks a lot more like this:
while True:
buf = fp.read(65536)
if not buf:
break
hasher.update(buf)
I found this issue while working on Python 3 compatibility (where iterating over bytestring returns integers, not 1-char bytestrings). ' might add failing test, and will fix the bug.
Cheers,
in order to be able to build website on linux too