kpdecker / jsdiff

A javascript text differencing implementation.
BSD 3-Clause "New" or "Revised" License

[BUG?] incompact separate logic when same character #378

Closed loynoir closed 10 months ago

loynoir commented 2 years ago

Given

Expected

Actual

Additional

If this is not a bug, it would be nice to have an option to separate at the last b rather than the first b.

ExplodingCabbage commented 10 months ago

Hmm. I guess the underlying intuition here is that it's better to preserve the b that's at the same index in the string? So e.g. with diffChars('1 baa 2','1 bbb 2') you WOULD want to preserve the first b?

I think to get a diff that matches your intuition here, you probably want a diffing algorithm in which edits can be substitutions, such as one based on Levenshtein distance. If edits can be substitutions, then the single optimal way to convert bbb to aab is to substitute the first two bs with as (which achieves the transformation with 2 edits). But to the Myers algorithm, which can only do insertions and deletions, it simply doesn't matter which of the three bs you keep; it's the same edit distance (4, made up of 2 insertions and 2 deletions) either way.
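To make the two edit counts concrete, here is a minimal sketch of a dynamic-programming edit distance that can be run with or without substitutions. It is not jsdiff code and does not use the Myers algorithm; the function name editDistance and the allowSubstitution flag are made up for illustration only.

```javascript
// Hypothetical helper for illustration; not part of jsdiff.
// Counts the minimum number of single-character edits needed to turn
// `from` into `to`. Insertions and deletions are always allowed;
// substitutions are allowed only when `allowSubstitution` is true.
function editDistance(from, to, allowSubstitution) {
  // prev[j] = distance between the prefix of `from` processed so far
  // and the first j characters of `to`.
  let prev = Array.from({ length: to.length + 1 }, (_, j) => j);

  for (let i = 1; i <= from.length; i++) {
    const curr = [i]; // turning i characters into "" takes i deletions
    for (let j = 1; j <= to.length; j++) {
      if (from[i - 1] === to[j - 1]) {
        // Characters match: keep the character at no cost.
        curr[j] = prev[j - 1];
      } else {
        // Delete from `from` (prev[j]) or insert from `to` (curr[j - 1]).
        let best = Math.min(prev[j], curr[j - 1]) + 1;
        if (allowSubstitution) {
          // Replace one character with another in a single edit.
          best = Math.min(best, prev[j - 1] + 1);
        }
        curr[j] = best;
      }
    }
    prev = curr;
  }
  return prev[to.length];
}

console.log(editDistance('bbb', 'aab', true));  // 2 (two substitutions)
console.log(editDistance('bbb', 'aab', false)); // 4 (two deletions + two insertions)
```

With substitutions disabled, keeping any one of the three bs yields a script of the same length, which is why an insert/delete-only diff has no cost-based reason to prefer one b over another.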

Since jsdiff is based on the Myers diff algorithm and that's unlikely to change, I don't think there's a reasonable way for us to make jsdiff behave in the way you wanted, though.