WinMerge / winmerge

WinMerge is an Open Source differencing and merging tool for Windows. WinMerge can compare both folders and files, presenting differences in a visual text format that is easy to understand and handle.
https://winmerge.org/
GNU General Public License v2.0

So slow comparison... #274

Closed by username1565 4 years ago

username1565 commented 4 years ago

For example, we have two folders:

folder1\1.txt

folder2\1.txt
folder2\1.JPG
folder2\2.JPG
...
folder2\1000.JPG

Comparing folder1 and folder2 is too slow... When the extra files are .txt or .json instead of JPG, the comparison takes even more time. After the comparison, clicking a file that does not exist in one of the folders opens it against an empty file. So can you make the program simply skip the comparison when a file does not exist in one folder, instead of comparing it with an empty file, and do that comparison only after a click?

Also, it would be better to just mark the files that have the same filenames first and compare their contents later, so that one can start working with the file differences while the contents are still being compared.
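The request above (pair files by name right away, compare contents only when asked) could look roughly like the following Python sketch. This is not WinMerge code: the class `LazyFolderCompare` and its method names are made up for illustration, and it assumes flat folders that contain only files.

```python
import filecmp
import os


class LazyFolderCompare:
    """Hypothetical sketch: list entries at once, compare contents on demand."""

    def __init__(self, left_dir, right_dir):
        self.left_dir = left_dir
        self.right_dir = right_dir
        self._cache = {}  # name -> True/False, filled only when requested

    def listing(self):
        """Classify names without opening any file."""
        left = set(os.listdir(self.left_dir))
        right = set(os.listdir(self.right_dir))
        status = {}
        for name in sorted(left | right):
            if name in left and name in right:
                status[name] = "pending"     # on both sides, not compared yet
            elif name in left:
                status[name] = "left only"   # nothing to compare against
            else:
                status[name] = "right only"
        return status

    def compare(self, name):
        """Compare one pair the first time it is asked for (e.g. on click).

        Only meaningful for names reported as "pending"; a name that exists
        on one side only raises FileNotFoundError.
        """
        if name not in self._cache:
            self._cache[name] = filecmp.cmp(
                os.path.join(self.left_dir, name),
                os.path.join(self.right_dir, name),
                shallow=False,  # compare bytes, not just os.stat() signatures
            )
        return self._cache[name]
```

With this shape, `listing()` costs two directory reads no matter how many unique files there are, and the per-file cost is paid only for the pairs that are actually opened.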

Best regards.

MailYouLater commented 4 years ago

WinMerge2011 includes a checkbox in Options > Compare > Folder for "Self-compare unique files". I'm guessing WinMerge is doing this same 'self-compare', and this option in WinMerge2011 was added to be able to disable it. Perhaps WinMerge should add this option too.

username1565 commented 4 years ago

@MailYouLater, maybe I found it in WinMerge v2.16.4.0, under "Edit -> Options" -> "Folder -> Include unique subfolders contents". But this option does not speed up the comparison very much. I also see "Compare method: Quick contents" there, but I still don't understand the following: why does the program need to compare a file with an empty file if that file does not exist in the second directory?..

Why can't this comparison be done later, when the file is clicked? It takes so much time when there are 10k files in the folder, for example... Hehheh...

MailYouLater commented 4 years ago

"Include unique subfolder contents" is a different option, disabling that stops WinMerge from even looking inside a subfolder if it's only present on one side of the folder comparison. I'm not really sure what a 'self-compare' does, but disabling it in WinMerge2011 does notably speed up comparisons like the one you described.

username1565 commented 4 years ago

Anyway, something needs to be done about this, like adding the "Self-compare unique files" option to the latest WinMerge version.

MailYouLater commented 4 years ago

I just dug around a bit, and according to a comment by jtuc (the person behind WinMerge2011) "Self-compare unique files ... allows WinMerge to retrieve information about binariness / encoding / EOL type" which is backed up by my experience, and by this code comment I found which describes the option:

/**
 * Self-compare unique files to detect encoding and EOL style.
 *
 * This value is true by default.
 */

(...interestingly, this value appears to have been changed to be false by default, but the comment still says it's true by default...)

Also, I initially hesitated to link to the code, since WinMerge2011 switched to GPLv3 and WinMerge is still GPLv2 (with the 'or later' clause), so I wanted to explain what needed to be changed instead. However, although the change is minor (put a piece of code in an if statement controlled by the option set in the options window), the code is too different between the two projects for me (someone who's pretty new to either project's source) to be sure of what needs to be done. So I kept looking and realized that the git history shows this option was added when WinMerge2011 was still being distributed under GPLv2. I'm therefore just going to link to the code change, along with the proof that that code was under GPLv2 (with the 'or later' clause).

The commit where this feature was added is 0e66cfbd6f0d2467dcf1e66cccaf500861c929f4, the license file present in that commit is for GPLv2 (with 'or later' clause), and here's a link to the pertinent part of the diff: https://bitbucket.org/jtuc/winmerge2011/commits/0e66cfbd6f0d2467dcf1e66cccaf500861c929f4#chg-Src/FolderCmp.cpp
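For readers who don't want to dig through the diff, the idea of "a piece of code in an if statement controlled by the option" can be sketched in Python as below. This is only an illustration, not the WinMerge2011 change itself: `detect_eol`, `scan_unique_files` and the `self_compare_unique` flag are hypothetical names, only EOL style is detected (not encoding or binariness), and flat folders containing only files are assumed.

```python
import os


def detect_eol(path):
    """Read a whole file to report its line-ending style.

    This stands in for the costly 'self-compare' step. Simplified: a file
    with mixed line endings is reported by the first style that matches.
    """
    with open(path, "rb") as f:
        data = f.read()
    if b"\r\n" in data:
        return "CRLF"
    if b"\n" in data:
        return "LF"
    if b"\r" in data:
        return "CR"
    return "none"


def scan_unique_files(left_dir, right_dir, self_compare_unique=False):
    """Return {name: (side, eol)} for files present on one side only.

    eol stays None unless self_compare_unique is set, so unique files are
    never opened when the option is off.
    """
    left = set(os.listdir(left_dir))
    right = set(os.listdir(right_dir))
    result = {}
    for name in sorted(left ^ right):  # names on exactly one side
        if name in left:
            side, folder = "left only", left_dir
        else:
            side, folder = "right only", right_dir
        eol = None
        if self_compare_unique:  # the check controlled by the option
            eol = detect_eol(os.path.join(folder, name))
        result[name] = (side, eol)
    return result
```

With the flag off, the cost of a unique file is just its directory entry; with it on, every unique file is read in full, which is where the time goes in a folder with thousands of one-sided files.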

sdottaka commented 4 years ago

If you do not need to compare text files with "Ignore case" option etc enabled, you can use the "Binary Contents method". Using that option should speed up the comparison a bit.


username1565 commented 4 years ago

@MailYouLater, I don't understand anything in C++ source code, and I just use the pre-compiled binaries from Releases to compare the differences between the old version of nanoboard and the new version - Nanoboard 3.3!!! (C#) So when the folders hold so many containers, the comparison is so slow... xD

Maybe, your links say something for @sdottaka, but sorry - this is not for me.

Have a nice day, all of you!

MailYouLater commented 4 years ago

Maybe, your links say something for @sdottaka, but sorry - this is not for me.

They were intended for anyone who wants to try their hand at implementing it and may want to reference that info, whether that's @sdottaka, you, me (in the future), or anyone else.

If you do not need to compare text files with "Ignore case" option etc enabled, you can use the "Binary Contents method". Using that option should speed up the comparison a bit.

Will this option skip reading files that exist only in one of the folders being compared? or does it just take less time to do the 'self-compare' because it's not doing some things that occur in the other compare types?

sdottaka commented 4 years ago

Will this option skip reading files that exist only in one of the folders being compared? or does it just take less time to do the 'self-compare' because it's not doing some things that occur in the other compare types?

Yes. This option first compares file sizes. If the files are the same size, the contents of the file are compared, but if the files are not the same size, the contents of the file are not read. For this reason, this option cannot determine the EOL type of a file, determine whether it is text or binary, ignore case, and so on.
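The size-first logic described above can be sketched in Python as follows. This is an illustration of the idea only, not WinMerge's actual implementation; the function name `binary_contents_equal` and the 64 KiB chunk size are assumptions.

```python
import os


def binary_contents_equal(left, right, chunk_size=64 * 1024):
    """Return True if both files hold exactly the same bytes."""
    # Cheap check first: different sizes mean different contents,
    # so neither file has to be opened or read.
    if os.path.getsize(left) != os.path.getsize(right):
        return False
    # Same size: compare the bytes chunk by chunk, stopping at the
    # first difference.
    with open(left, "rb") as f1, open(right, "rb") as f2:
        while True:
            a = f1.read(chunk_size)
            b = f2.read(chunk_size)
            if a != b:
                return False
            if not a:  # both files ended at the same point
                return True
```

Because only raw bytes are looked at, two text files that differ just in case or in line endings count as different here, which matches the limitation mentioned above.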

username1565 commented 4 years ago

Yep, I tested it again, and this really works:

1544 files total, to compare. WinMerge 2.16.4.0. Enable comparison in binary mode: 
33      WAIT, SLOW COMPARISON...
58      WAIT, SLOW COMPARISON...
87      WAIT, SLOW COMPARISON...
124     WAIT, SLOW COMPARISON... 
135     WAIT, SLOW COMPARISON... Wanted to stop it, but...
162     WAIT, SLOW COMPARISON... Wanted to stop it, but...
1301        UNEXPECTED JUMPING HERE!!!
1359        WAIT... 
1400        WAIT...
1412        WAIT...
1435        WAIT...
1488        WAIT...
1544        WAIT...
Result: 266 files equal, 1133 on left side only, 7 on right side only, 14 files changed.

Maybe this just needs to be enabled by default. But wait... You say that this option compares file sizes... So what about the case where the second file's size is zero because the file was not found, in all the other modes? The file is compared with an empty file, and the whole existing file is read line by line into a diff against an empty file, right? So why not just disable that when the file is not found or its length is zero, and make this the default for all the other modes? :smile_cat:

Best regards! And... This issue is closed! Nice program!