kpdecker / jsdiff

A javascript text differencing implementation.
BSD 3-Clause "New" or "Revised" License
8.1k stars 501 forks source link

How to minimize the number of insertions? #167

Open nachocab opened 7 years ago

nachocab commented 7 years ago

image

The example above showcases what I'm trying to do: the first line has two insertions ("very", "correct and"), while the second line only has one insertion ("and correct"). Is there a parameter I could use to penalize gap opening? That way, the first example would have a single insertion "very and correct"

ExplodingCabbage commented 10 months ago

No such option currently; we don't have any concept of customising edit costs / penalties. Implementing anything along these lines would probably be a pretty major algorithm change, I think (except maybe something that uses this as a way to tiebreak between routes of equal edit distance - but I'm not totally sure that even that is straightforward!).

ExplodingCabbage commented 10 months ago

I'll have a think about this, and will read up on diffing algorithms that allow custom gap penalties (like Smith-Waterman and Needleman-Wunsch) but if I ultimately conclude that this would require a fundamentally different algorithm to the Myers diff algorithm that jsdiff currently uses, I'll probably close this.