pnedev / comparePlus

Compare plugin for Notepad++
GNU General Public License v3.0

Implement cross-diff similarities check #395

Open marcofbbr opened 1 month ago

marcofbbr commented 1 month ago

Hi

If you compare the attached files, you will notice that the first column of subnets (on the left) is the same in both, but the plugin does not recognize this.

If you delete the line that contains the word "Network" from one of the two files, the comparison works as expected.

Screenshot 2024-07-30 160648

Attached files: new 25.txt, new 24.txt

EDIT reason: rephrased.

pnedev commented 1 month ago

Hello @marcofbbr ,

That is perfectly normal because the "Network..." lines match exactly and provide the context for the local sub-matches. Currently, if the algorithm finds matching lines/sections, the remaining diffs are split into two sub-groups: those above and those below the matching sections. Those sub-groups are then matched for similarities separately. That means the order of lines in the files actually matters: if you move the "Network..." line in one of the files to match the position it has in the other file, the result will also change. In your compare case you'll need to "help" the comparison a bit to actually find the differences you are particularly interested in (by ignoring certain lines/symbols, or by reordering the lines after the first comparison to change the compare context according to your needs).
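
As a rough illustration of the behaviour described above, here is a minimal Python sketch. It uses the standard-library `difflib.SequenceMatcher` as a stand-in and is not the plugin's actual algorithm; the function name `diff_groups` and the sample lines are made up for the example.

```python
from difflib import SequenceMatcher

def diff_groups(left, right):
    """Return the non-matching line groups that sit between exact matches."""
    matcher = SequenceMatcher(a=left, b=right, autojunk=False)
    groups = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Each group is only ever compared within itself.
            groups.append((left[i1:i2], right[j1:j2]))
    return groups

# Hypothetical data: the "Network list" line is an exact match (an anchor),
# but it sits at a different position in each file.
left = ["10.0.0.0/24 siteA", "Network list", "10.0.1.0/24 siteB"]
right = ["Network list", "10.0.0.0/24 siteA-new", "10.0.1.0/24 siteB"]
split_groups = diff_groups(left, right)

# Same content, with the anchor moved to the same position as in `left`.
right_moved = ["10.0.0.0/24 siteA-new", "Network list", "10.0.1.0/24 siteB"]
paired_groups = diff_groups(left, right_moved)
```

In the first call the two `siteA` lines end up in different groups (one above the anchor, one below it), so they are never compared with each other. In the second call they land in the same group and can be checked for similarity.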

P.S. If you don't have further comments or suggestions I will close this issue thread after a week.

BR

marcofbbr commented 1 month ago

Thanks for your feedback. I understand your logic. Please note that I am not a coder, so I am giving my feedback to improve the accuracy for my specific case, hoping that it could also be useful on a general level. Sometimes it is a bit tricky or time-consuming to "help" the plugin if the output is very long. Wouldn't it be possible, and worthwhile, to do an additional check for such cases? If the algorithm also compared, in addition to what you already described, the initial (or even an internal) part of each line marked with +/-, it could notice that there are some similarities. For example, line 8 in the picture, left side, is very similar to line 7 on the right side. Based on that, the plugin could inform the user about a misaligned similarity with an icon. The lines in the output I provided would then get a new icon instead of a misleading + and -. Potentially, a similarity scale (high, mid, low) could be defined based on the percentage of matching text, and that information could be included in the icon.
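
A minimal Python sketch of what such a cross-diff check might look like, assuming the removed (-) and added (+) lines are already available as two lists. The thresholds and the names `grade` and `cross_similarities` are hypothetical; the thread only proposes a high/mid/low scale and gives no percentages.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds: the thread only suggests a high/mid/low scale.
THRESHOLDS = (("high", 0.9), ("mid", 0.7), ("low", 0.5))

def grade(ratio):
    """Map a similarity ratio in [0, 1] to a label, or None if too dissimilar."""
    for label, minimum in THRESHOLDS:
        if ratio >= minimum:
            return label
    return None

def cross_similarities(removed, added):
    """For each removed line, report the most similar added line, if any."""
    results = []
    for r_idx, r_line in enumerate(removed):
        best_ratio, best_idx = 0.0, None
        for a_idx, a_line in enumerate(added):
            ratio = SequenceMatcher(None, r_line, a_line, autojunk=False).ratio()
            if ratio > best_ratio:
                best_ratio, best_idx = ratio, a_idx
        label = grade(best_ratio)
        if label is not None:
            # (index in removed, index in added, similarity label)
            results.append((r_idx, best_idx, label))
    return results
```

Note that this compares every removed line with every added line, so the cost grows with the product of the two list lengths; for very long outputs a real implementation would likely need to limit the search.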

Perhaps another approach to avoid this situation (at least in this case) would be to eliminate the perfect matches from both files, either automatically or with user interaction, and run the compare again. I am not sure it would be worth developing such a feature, but in my specific case it would fix the problem.
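
The idea of removing the perfect matches before a second compare could be sketched in Python as follows. This is only an illustration under the assumption that "perfect match" means an identical line present in both files; the function name `drop_common_lines` and the sample lines are made up.

```python
from collections import Counter

def drop_common_lines(left, right):
    """Remove lines that occur in both files, keeping the original order."""
    # Multiset intersection: a line repeated in both files is dropped
    # only as many times as it appears in both.
    common = Counter(left) & Counter(right)

    def strip(lines):
        budget = common.copy()
        kept = []
        for line in lines:
            if budget[line] > 0:
                budget[line] -= 1  # consume one exact match
            else:
                kept.append(line)
        return kept

    return strip(left), strip(right)

# Hypothetical data in the spirit of the attached files.
left = ["10.0.0.0/24 siteA", "Network list", "10.0.1.0/24 siteB"]
right = ["Network list", "10.0.0.0/24 siteA-new", "10.0.1.0/24 siteB"]
left_rest, right_rest = drop_common_lines(left, right)
```

After this step only the differing lines remain in each list, so a second compare no longer has an exact match sitting between them.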

I hope this helps.

pnedev commented 1 month ago

Thank you for the feedback and the suggestions. 👍