malensek / jbsdiff

Java implementation of the bsdiff algorithm
http://sigpipe.io/jbsdiff
Other
126 stars 41 forks source link

java.lang.StackOverflowError #13

Open keshav0891 opened 6 years ago

keshav0891 commented 6 years ago

Hi,

While using this library, I am getting java.lang.StackOverflowError for one of the file on line SuffixSort.java#L178,190 It is working fine for other files. I used mac's bsdiff utility on the same file and was able to generate patch successfully. Can you suggest something?

~Keshav

Nzen commented 6 years ago

Perhaps SuffixSort.split() could utilize an exclusively iterative, rather than recursive, algorithm if it senses the files are too big. (Or some other quality that would blow the stack. Too many unique elements? This section is pretty terse.) If the files aren't sensitive, could you upload them somewhere?

keshav0891 commented 6 years ago

The file I am running the tests upon is a Tar file which is a port of Firefox for our custom usecase. This tar file is 92 MB in size. I am not sure if i can upload the files publicly. Will revert after consulting with others. Do you have an iterative approach in mind?

Nzen commented 6 years ago

My experiments suggest that file size isn't a limiting feature. Jbsdiff was able to diff 120 MB mp3 files, 120 MB encrypted files, and 180MB files (when I increased max heap past 2GB). I had hoped that there would be some quicker heuristic for choosing between a recursive and iterative strategy. Perhaps something like cortesi's binvis would reveal what is antagonistic about this particular pair of files. (Which is to say, no need to upload them.)

It seems as though SuffixSort.split() needs to be rewritten as an iterative loop to handle keshav0891's use case.

segator commented 6 years ago

you are passing all file to an array you will get outofmemory, you should change it to fileinputstream