rtfpessoa closed 8 years ago
@escitalopram I did some debugging, and the Rematch.distance(amod, bmod)
algorithm is taking too long; it may even be stuck in an infinite loop.
Do you have any idea what is going on?
I'll have a look
The problem seems to be triggered by large blocks of changes, like OASIS.csproj having about 2.2k lines added and removed in one block. The algorithm takes O(nm) time for n lines added and m lines removed in a single block, which here means almost 5 million Levenshtein distance calculations, each of which in turn takes O(op) time, with o and p being the lengths of the two lines. I'd suggest we just disable line matching on blocks larger than, say, n*m = 2500 (and maybe make that limit configurable).
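To make the proposal concrete, here is a minimal sketch of such a guard. It is not diff2html's actual code: `levenshtein`, `shouldMatchLines`, `matchBlock`, and `MAX_COMPARISONS` are hypothetical names, and the greedy pairing is a simplification chosen only to show where the n*m check would sit.

```javascript
// Hypothetical sketch; names and pairing strategy are assumptions, not diff2html's API.
const MAX_COMPARISONS = 2500; // the n * m limit suggested in the comment above

// Classic dynamic-programming Levenshtein distance: O(o * p) for line lengths o and p.
function levenshtein(a, b) {
  if (a.length === 0) return b.length;
  if (b.length === 0) return a.length;
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// True when the block is small enough for pairwise line matching.
function shouldMatchLines(removed, added, maxComparisons = MAX_COMPARISONS) {
  return removed.length * added.length <= maxComparisons;
}

// Pairs each removed line with its closest added line (greedy),
// or returns null when the block exceeds the comparison limit.
function matchBlock(removed, added, maxComparisons = MAX_COMPARISONS) {
  if (!shouldMatchLines(removed, added, maxComparisons)) {
    return null; // caller would render plain deletions followed by insertions
  }
  return removed.map((oldLine) => {
    let best = null;
    let bestDistance = Infinity;
    for (const newLine of added) {
      const d = levenshtein(oldLine, newLine);
      if (d < bestDistance) {
        bestDistance = d;
        best = newLine;
      }
    }
    return { oldLine, newLine: best, distance: bestDistance };
  });
}
```

With a block of 2,200 removed and 2,200 added lines, `shouldMatchLines` returns false before any distance is computed, so the quadratic loop never starts. Passing `maxComparisons` as a parameter is one way the limit could be made configurable.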
The memory hunger will probably go away with that, too, because the results of the distance function are cached and far fewer of them will be stored. If that isn't enough, maybe I could also introduce some hash function for the cache keys.
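As a rough illustration of the hashed-key idea, the sketch below wraps a distance function in a cache whose keys are short hashes instead of the full line contents. The names `fnv1a` and `makeCachedDistance` are hypothetical, the hash is an FNV-1a variant applied to UTF-16 code units, and this is an assumption about how such a cache could look rather than what the library does. Note that a 32-bit hash can collide, in which case a wrong cached distance would be returned, so this trades a small risk of imprecise matching for lower memory use.

```javascript
// Hypothetical sketch; not diff2html's actual cache.

// 32-bit FNV-1a variant over UTF-16 code units; returns an unsigned integer.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Wraps `distance` so that results are stored under short hashed keys
// rather than under the (possibly very long) concatenated lines.
function makeCachedDistance(distance) {
  const cache = new Map();
  const cached = (a, b) => {
    const key = fnv1a(a) + ":" + fnv1a(b);
    if (cache.has(key)) {
      return cache.get(key);
    }
    const result = distance(a, b);
    cache.set(key, result);
    return result;
  };
  cached.cacheSize = () => cache.size;
  return cached;
}
```

Each key is at most 21 characters long regardless of line length, so the cache's memory footprint depends on the number of distinct pairs rather than on how long the compared lines are.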
I think that is a great idea. Can you make a PR?
Which branch should I base it on?
master
Fixed by #68 in release 2.0.0-beta10
MOVED FROM diff2html-cli#17
Hi, I really love the tool and am currently running it under Windows. However, my git diff file is around 300 KB, and the tool takes 3 hours to finish without producing any output file (I am using the -F option). Memory usage is around 800 MB.
Just wondering if you have encountered the same issue before?
I tried diffy.org without any issues at all.
https://diffy.org/diff/4wng00ndqz7iudi
Thanks.
Travis
diffReport.txt