zats / google-diff-match-patch

Automatically exported from code.google.com/p/google-diff-match-patch
Apache License 2.0
2 stars 8 forks source link

Bad space/time complexity on non-trivial files #26

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Download the attached sample files (two whitespace-simplified versions
of a generated source file in the Hadoop project).
2. Use the library to compute the diff. The diff timeout would have to be
increased significantly or set to 0.0f.

What is the expected output? What do you see instead?
GNU diff computes a diff with about 1260 edit steps in 0.125s on my
machine. diff-match-patch with the diff timeout removed fails to terminate
in both its C++ and Java versions, consuming all available system memory.

What version of the product are you using? On what operating system?
Latest svn trunk on GNU/Linux amd64.

Please provide any additional information below.
The attached files are just an extreme example; I have also found it
infeasible to compute a diff between two files of about 2000 lines each in
Java with a 2G heap. The timeout prevents all memory from being used, but
results in a trivial "delete A, insert B" diff, which is not useful.

Original issue reported on code.google.com by pavgusti...@gmail.com on 20 Jan 2010 at 12:03

Attachments:

GoogleCodeExporter commented 9 years ago
I've also encountered this problem. Are there factors which I can change to 
increase performance?

Original comment by Nathanae...@gmail.com on 3 Feb 2010 at 5:19

GoogleCodeExporter commented 9 years ago
Just released a new version.  The Java and Python versions have been tripled in 
speed, and the JavaScript version has been doubled in speed (for Chrome).

Performance improvements are ongoing.  The C++ version is known to be extremely 
bad due to its use of the QT library and complete lack of pointer math.

Original comment by neil.fra...@gmail.com on 26 Aug 2010 at 4:10

GoogleCodeExporter commented 9 years ago
Issue 20 has been merged into this issue.

Original comment by neil.fra...@gmail.com on 26 Aug 2010 at 4:12

GoogleCodeExporter commented 9 years ago
Issue 29 has been merged into this issue.

Original comment by neil.fra...@gmail.com on 26 Aug 2010 at 4:40

GoogleCodeExporter commented 9 years ago
A huge update landed today that should resolve all the out of memory errors and 
greatly improve performance.  Take a look.

Over the past year and a half the Java and C# versions have improved in speed 
by more than a factor of twenty.

I'm going to close this bug, because although speed improvements will continue 
to be made, I've certainly addressed the examples in this bug and the ones 
merged with it.

Original comment by neil.fra...@gmail.com on 17 Dec 2010 at 4:52