Open jumper444 opened 2 years ago
I also changed the compare to RO (read only) for both source locations on the theory that maybe there was some sort of locking or full file operation happening in normal mode and that checking 'read only' would force the program to only pull 1Mx200 file 'tops' quickly. That made no difference. The binary compare is 30min+ for a data amount that should only be 200M from each USB stick and finish in seconds.
I came to a strange conclusion about what could be causing this, but I can't for the life of me see why it would be programmed this way. This could be happening if the "compare limits" are implemented as sample amounts taken throughout a file (like you sample a population in statistics). In plain English the process would be: "If the compare limit is 64M then the entire files are read and 64M of data interspersed within the files are SAMPLED and compared" (instead of the top 64M of each file). And thus if you change to a 1M compare limit it still reads the entire files but then only samples 1M instead of 64M. I'm sure this isn't what is happening...is it? But why isn't the program just quickly pulling the top 1M of each file and finishing in seconds?
I mentioned both binary and quick compare in the issue title, but my description above was only about binary compare. Yes, I attempted a quick compare also (and with minimal 1M limit) and, again, there seemed to be no speed improvement and the program seemed to be processing/reading all files fully (even if ultimately only comparing the small limit).
Update: I allowed the read-only binary compare to finish: 1M compare limit, 200 files, on 2 USB sticks (one USB3 and one USB2): "Elapsed time: 3770603 ms" (all files identical). That is about 63 minutes.
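As a rough sanity check on that figure, the arithmetic below converts the elapsed time and works out the read rate it would imply under the two hypotheses discussed in this issue. It only uses numbers quoted in this thread; treating "40G" as 40 decimal gigabytes per stick and 1M as 1 MiB are assumptions made for the calculation.

```python
# Figures quoted in this thread; sizes are approximate.
elapsed_ms = 3_770_603
total_bytes_per_stick = 40 * 1000**3   # assumption: "about 40G" taken as 40 GB (decimal)
file_count = 200
limit_bytes = 1 * 1024**2              # assumption: 1M compare limit taken as 1 MiB

elapsed_s = elapsed_ms / 1000
elapsed_min = elapsed_s / 60

# Read rate per stick implied if every file was read in full.
full_read_mb_s = total_bytes_per_stick / elapsed_s / 1e6

# Read rate per stick implied if only the first 1M of each file was read.
prefix_read_mb_s = file_count * limit_bytes / elapsed_s / 1e6

print(f"elapsed: {elapsed_min:.1f} min")
print(f"implied rate, full read:   {full_read_mb_s:.1f} MB/s")
print(f"implied rate, prefix only: {prefix_read_mb_s:.3f} MB/s")
```

A rate of roughly 10 MB/s per stick is plausible for a slow USB stick, while a rate well under 0.1 MB/s is not, so the timing looks more consistent with the files being read in full. This does not prove what the program is doing internally.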
I then did a Windows command prompt binary file compare "fc /b usb1FileX usb2FileX" where FileX is one of those 200 files and is about 200M in size. It took 10 seconds. Thus, my view is that WinMerge should be taking, at most, about 15 seconds (factoring in some overhead and multiple files) vs over an hour for what it is doing.
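For anyone who wants to repeat this kind of timing without `fc`, here is a minimal Python sketch of a full byte-for-byte comparison with a timer around it. The function names and the 1 MiB chunk size are illustrative choices, not anything taken from WinMerge or `fc`.

```python
import time


def files_identical(path_a, path_b, chunk_size=1024 * 1024):
    """Compare two files byte for byte, reading both in fixed-size chunks."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False      # content or length differs
            if not chunk_a:
                return True       # both files ended at the same point


def timed_compare(path_a, path_b):
    """Return (identical, seconds) for one full comparison."""
    start = time.perf_counter()
    same = files_identical(path_a, path_b)
    return same, time.perf_counter() - start
```

Calling `timed_compare` on the same pair of files from the two sticks gives a baseline for how long a complete read of one file pair takes on that hardware. Operating system caching can make a second run much faster than the first.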
I'm sorry the option name is so misleading, but "binary compare limit" does not mean what you think it means.
There is a "Binary contents method" that allows faster comparisons than the "Full contents method" and "Quick contents method." The size specified in "Binary compare limit" means the file size to switch from "Full contents method" or "Quick contents method" to "Binary contents method" in order to compare large files at high speed.
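The rule described above can be sketched as a small function. This is only an illustration of the explanation in this comment, not WinMerge's actual code: the function name, the method strings, and the choice of a strict "greater than" comparison are all assumptions.

```python
MB = 1024 * 1024


def effective_method(configured_method, file_size, binary_limit_mb=64):
    """Pick the method used for one file pair (hypothetical sketch).

    The limit is a file-size threshold: files larger than it are compared
    with the binary contents method even when "full" or "quick" is selected.
    It is not a cap on how many bytes of each file are read.
    """
    # Assumption: the switch happens strictly above the limit.
    if configured_method in ("full", "quick") and file_size > binary_limit_mb * MB:
        return "binary"
    return configured_method


print(effective_method("full", 200 * MB))                        # binary
print(effective_method("full", 10 * MB))                         # full
print(effective_method("binary", 200 * MB, binary_limit_mb=1))   # binary
```

Under this reading, lowering the limit from 64 to 1 only changes which files get switched to the binary method. A file that is already compared with the binary method is still read in full.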
The "Binary compare limit" is misleading, so I think we need to improve the wording.
The default "full contents" compare method is a multi-threaded comparison. If the file size is large, it will switch to the binary contents compare method according to the "binary compare limit", but the multi-threading will continue. This multi-threaded comparison is done to speed up the diff calculation, but it may slow down the process if the I/O load is high.
If the Binary contents compare method is specified from the beginning, the comparison will be performed in a single thread.
Could you see if the comparison speed changes when you change from the "Full Contents" compare method to the "Binary Contents" compare method as shown below?
Sorry, no reply yet from me due to holidays. Please give me a few more days...
The issue is that WinMerge sneaks into archives. It goes into .ISO files and there is no way to turn that off.
Also, signing up on GitHub is a mess too. GitHub is not a user-friendly support forum.
I'm aware that nothing will happen as a result of this angry post, but I can't do the compare I need, and I have no idea how WinMerge filters could stop it from sneaking into archives (archive support is turned off, of course).
That's also why you get my unfriendly rubbish user name.
(using winmerge 2.16.16.0, x86)
ISSUE: I believe winmerge is reading or processing full file data even when, during a compare, it is only set to use a small 'compare limit' and thus SHOULD only be reading the 'top' x MB of a file.
SPECIFICS: Comparing about 40G of files on two USB sticks. Most of the files are very large (200M-500M) and there is only a comparatively small number of them (200 files).
I mention USB sticks to indicate a data transfer constraint for this issue report (which might cause the problem to not be noticeable in a direct PCI/SSD situation). USB can transfer data fine, but a complete read of 40G from 2 sticks would normally take time, as expected (especially when one of them is USB2).
A full content compare of this situation is probably 30 minutes or more (I haven't waited to find out). I started one and it went very slowly and I stopped. But I saw the speed as it was proceeding, which is important.
I then went to settings and changed to binary compare (and left the 64M compare limit) and started again. SAME SPEED, by all measures. Dropping from full to 64M didn't seem to improve speed. So I thought 64Mx200=12G might still be too high for me to notice an improvement, and I changed the binary compare limit to a minimal 1M.
1Mx200=200M (times 2 usb sticks) Taking the 'top' 1M of 200 files and pulling it over USB (even USB2) from 2 sticks should take SECONDS.
BUG/ISSUE: the speed of the compare hasn't seemingly changed at all. I'm running a binary (1M limit) compare on 200 files as I type this and it is taking similar time to a full-read 40Gx2 data operation (by rough appearances. I don't have exact timings here, but 30 min to compare vs seconds is blatant.)
ASSERTION: I do not believe winmerge is just pulling the 'top' 1M of files. Somehow it is processing/reading the full files (and then once they get to the CPU from the USB probably just operating on the 1M compare limit), but still pulling/reading the entire file first through the USB bus.
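For reference, the behaviour I was expecting, reading only the first N bytes of each file and comparing those, would look roughly like the Python sketch below. It illustrates the expectation only; the function name and the 1 MiB default are made up for this example and say nothing about how WinMerge is implemented.

```python
def prefixes_identical(path_a, path_b, limit_bytes=1024 * 1024):
    """Compare only the first `limit_bytes` of two files (hypothetical helper)."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        # Each read stops after limit_bytes, so the rest of the file
        # is never requested from the device.
        return fa.read(limit_bytes) == fb.read(limit_bytes)
```

With 200 file pairs and a 1M limit this would request about 200M from each stick in total, which is why I expected the whole run to finish in seconds.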
SETTINGS of the compare I'm using now, which is extremely slow as though it was a full compare (a setting not mentioned is unchecked): Binary compare method, Include Subfolders, Binary Compare limit(MB) 1, whitespace ignore all
Is there an explanation (or something I'm doing wrong) as to why pulling 200x1M=200M of total data from two USB sticks for binary compare takes almost 30 minutes to complete (other than WinMerge somehow fully reading each file regardless of the "compare limit" setting)?