kpdecker / jsdiff

A javascript text differencing implementation.
BSD 3-Clause "New" or "Revised" License
7.92k stars 496 forks source link

Weird bug in result of diff #404

Closed khechoshvili closed 7 months ago

khechoshvili commented 1 year ago

I encountered bug with this code.

const Diff = require('diff');

const one = `<div>beepa boop</div><div>beedp boop</div>`,
other = `<div>beep boop</div><div>beep boop</div><div>beep boop</div>`;

let span = null;

const diff = Diff.diffChars(one, other),
display = document.getElementById('display'),
fragment = document.createDocumentFragment();

diff.forEach((part) => {
  // green for additions, red for deletions
  // grey for common parts
  const color = part.added ? 'green' : part.removed ? 'red' : 'grey';
  span = document.createElement('span');
  span.style.color = color;
  span.appendChild(document.createTextNode(part.value));
  fragment.appendChild(span);
});

display.appendChild(fragment);

See here.

There are three changes which I made in the "other" variable, namely:

However, please see the screenshot. It generates a weird diff: screenshot_from_2023-06-22_13-45-49

It didn't notice removal of "d".

Can someone explain this behavior? (I do have rough explanation why such diff was generated, but still this doesn't seem to be a optimal diff). So is this a bug?

mmitch commented 1 year ago

This does not look like a bug to me, more like "works as designed". diff tries to find the shortest sequence of characters that have to be changed to get from one string to another.

The chosen solution is on the shortest path. even if it looks weird: You have 1 removed character and 19 added characters (notice the d in the middle that is not inserted but reused). That's 20 changed characters in total.

If you count your manual actions, you have:

That's 22 changed characters, which is not an optional solution because it is not the shortest sequence to get from the first string to the second.

ExplodingCabbage commented 7 months ago

Yeah, this is the diff that minimises the number of characters inserted & deleted, which is what diffChars is meant to give you.