Open pavel-odintsov opened 10 years ago
Ha, forgot about that.
I did have some horrible buffering algorithm before, but I was trying to make readable code so I just did that :)
I've committed a change that doesn't do that anymore - but it does issue lots of small reads. I also added support for buffering when opening a file (just the 3rd argument to open), so hopefully that's not too slow.
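For readers following along, here is a minimal sketch of what "buffering via the 3rd argument to open" means. The helper name `read_small_records` and the sizes are hypothetical and are not taken from pyrdiff. The point is only that many small `read()` calls are served from an in-memory buffer rather than each one hitting the OS.

```python
def read_small_records(path, record_size=16, buffer_size=64 * 1024):
    """Read a file as many small records through a buffered file object.

    Hypothetical helper for illustration; not part of pyrdiff.
    """
    records = []
    # The third positional argument of open() is the buffer size in bytes.
    # With a buffer, each small read() below is usually served from memory.
    with open(path, "rb", buffer_size) as fd:
        while True:
            record = fd.read(record_size)
            if not record:  # empty bytes object means end of file
                break
            records.append(record)
    return records
```

Whether this is fast enough still depends on the interpreter and on the read sizes, so it is worth measuring on real data.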
Thank you so much :) You are very kind!
Just a warning - you might want to write some test cases for this if you actually use it on real data. I tried a few, but who knows what crazy stuff I've done...
Hello, Michael!
I compiled (oh, it took about 8 hours!) and installed pypy-3 to test the mainline version of pyrdiff.
I ran it on a 4 GB file and it ate almost all my memory :)
695218 root 20 0 8380m 6.6g 12m R 100.0 1.7 3:47.99 pypy-c
Because of this:
def _generate_delta(signatures, changedfd):
    """Given a file object and signatures from another file,
    generate a set of deltas (LiteralChange / CopyChange)"""
    buf = changedfd.read() # Just read the whole damn file into memory
Could you fix it in the Python-3 branch?
And this code:
buf = buf[offset+blocksize:]
Looks like a memory allocator killer... but I'm not sure about the underlying implementation.
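On the slicing concern: in CPython, slicing a `bytes` object builds a new object containing a copy of the slice, so `buf = buf[offset+blocksize:]` copies the rest of the buffer on every step. How PyPy handles this is not confirmed here. One common way to avoid the repeated copies is to keep the buffer fixed and advance an index, optionally through a `memoryview`. The function name below is hypothetical and this is only a sketch of the idea, not the pyrdiff code.

```python
def consume_with_offset(buf, blocksize):
    """Walk through buf block by block without re-slicing the whole buffer.

    Hypothetical helper for illustration; not part of pyrdiff.
    """
    view = memoryview(buf)  # slicing a memoryview does not copy the data
    pos = 0
    blocks = []
    while pos < len(view):
        # Only this one block is copied, not the remainder of the buffer.
        blocks.append(bytes(view[pos:pos + blocksize]))
        pos += blocksize  # advance the index instead of shrinking buf
    return blocks
```

With this shape the work per step is proportional to the block size rather than to the amount of data still left in the buffer.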
Those two are related. I don't have any time to do this work at the moment, but feel free to send a pull request if you make the change.
Hello!
Done :) Please check this: https://github.com/therealmik/pyrdiff/pull/3 But please review the patch very thoroughly, because I did not run it against enough test cases.
Hello!
I plan to test delta generation performance, but found this in your code:
Could you change this code to read from the file incrementally instead of loading the whole file into memory?
Thank you so much :)
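As a rough illustration of the request above, reading in bounded chunks could look like the sketch below. The generator name `iter_blocks` is hypothetical. It only shows how to keep memory use bounded by the block size. A real delta generator also has to match blocks at arbitrary byte offsets with a rolling window, which this sketch does not attempt.

```python
import io


def iter_blocks(fd, blocksize):
    """Yield successive blocks of at most blocksize bytes from a binary file object.

    Hypothetical helper for illustration; not part of pyrdiff.
    """
    while True:
        block = fd.read(blocksize)
        if not block:  # empty bytes object means end of file
            return
        yield block


# Usage with an in-memory stream standing in for a large file:
for block in iter_blocks(io.BytesIO(b"abcdefghij"), 4):
    print(len(block))
```

At most one block is held in memory at a time, so peak usage no longer grows with the file size.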