langram / node-diff3

This is a node.js library for text diffing and three-way-merge. It originally come from project Synchrotron, created by Tony Garnock-Jones. For more detail please visit: http://homepages.kcbbs.gen.nz/tonyg/projects/synchrotron.html.
MIT License
12 stars 5 forks source link

Error in the diff3_merge module #3

Open deepbasu007 opened 8 years ago

deepbasu007 commented 8 years ago

For certain values of arrays, the diff3_merge module fails to merge them correctly, even when there is no conflict.

Here's an example where issues occur:

a = ["n4100522632", "n4100697091", "n4100697136", "n-10000", "n4102671583", "n4102671584", "n4102671585", "n4102671586", "n4102671587", "n4102671588", "n4102677889", "n4102677890", "n4094374176"]

o = ["n4100522632", "n4100697091", "n4100697136", "n4102671583", "n4102671584", "n4102671585", "n4102671586", "n4102671587", "n4102671588", "n4102677889", "n4102677890", "n4094374176"]

b = ["n4100522632", "n4100697091", "n4100697136", "n4102671583", "n4102671584", "n4102671585", "n4102671586", "n4102671587", "n4102671588", "n4102677889", "n4105613618", "n4102677890", "n4105613617", "n4094374176"]

merged result from diff3_merge = ["n4100522632", "n4100697091", "n4100697136", "n4102671583", "n4102671584", "n4102671585", "n4102671586", "n4102671587", "n4102671588", "n4102677889", "n4105613618", "n4102677890", "n4102671583", "n4102671584", "n4102671585", "n4102671586", "n4102671587", "n4102671588", "n4102677889", "n4102677890", "n4094374176"]

GNU diff3 result = ["n4100522632", "n4100697091", "n4100697136", "n-10000", "n4102671583", "n4102671584", "n4102671585", "n4102671586", "n4102671587", "n4102671588", "n4102677889", "n4105613618", "n4102677890", "n4105613617", "n4094374176"]

As can be seen, the GNU diff3 result is as expected. However, the npm-diff3 fails to merge and creates ambiguities. Any clues as to what's happening here?

tonyg commented 8 years ago

I can reproduce this. It looks like the diff3 part of the code is at fault, because diffComm and diffPatch on o/a and o/b yield sensible-looking results.

deepbasu007 commented 8 years ago

Thank's for looking into it! It seems that the GNU diff works well in this case though I am not sure about other cases. Also, what I found is that when these ambiguities occur, sometimes even conflicts which should be raised are silently ignored. Something weird! Thanks anyway. I will keep hooked on to this thread for any update.

tonyg commented 8 years ago

I found it. It's to do with JavaScript's .sort() function, which sorts by string representation by default. Here's the necessary patch: https://github.com/tonyg/synchrotron/commit/7eb715d08d3c7b479b30b9f413a4a61db9940e82

deepbasu007 commented 8 years ago

Ok I will try this right away! Anyway, so what does x[0]-y[0] do? Isn't it comparing only the first character of the strings? I am sorry but a little confused.

tonyg commented 8 years ago

Each of x and y is a "hunk" -- an array with a handful of fields. The first (index 0) is, I think, the line number of the hunk, but I'm hazy on the details. It was long ago. But I think it's clear that the intention of the sort() call was to order hunks by their first field.

(Try adding console.log(hunks) before and after the sort() to see what's going on.)

implausible commented 8 years ago

Any chance this fix can be pushed as an 0.0.2?