sergi / go-diff

Diff, match and patch text in Go

Use common lineHash to share indices between text1 and text2 for correct line diffs #135

Closed nrnrk closed 1 year ago

nrnrk commented 1 year ago

What?

Use a single cache of line contents, shared between the two texts, in DiffLinesToChars so that line diffs are computed correctly.

Why?

In some cases, the standard line-diff sequence (DiffLinesToChars, DiffMain, DiffCharsToLines) does not produce correct line diffs. The following program is one example.

package main

import (
    "fmt"

    "github.com/sergi/go-diff/diffmatchpatch"
)

const (
    text1 = `hoge:
  step11:
  - arrayitem1
  - arrayitem2
  step12:
    step21: hoge
    step22: -93
fuga: flatitem
`
    text2 = `hoge:
  step11:
  - arrayitem4
  - arrayitem2
  - arrayitem3
  step12:
    step21: hoge
    step22: -92
fuga: flatitem
`
)

func main() {
    dmp := diffmatchpatch.New()
    a, b, c := dmp.DiffLinesToChars(text1, text2)
    diffs := dmp.DiffMain(a, b, false)
    diffs = dmp.DiffCharsToLines(diffs, c)
    // DiffCleanupSemantic improves the result a little, but not enough.
    // diffs = dmp.DiffCleanupSemantic(diffs)
    fmt.Println(diffs)
}
Output:
[{Insert hoge:
  step11:
hoge:
} {Equal hoge:
} {Insert hoge:
} {Equal   step11:
} {Insert hoge:
} {Equal   - arrayitem1
} {Insert hoge:
} {Equal   - arrayitem2
} {Insert hoge:
} {Equal   step12:
} {Insert hoge:
} {Equal     step21: hoge
} {Insert hoge:
} {Equal     step22: -93
} {Delete fuga: flatitem
}]

This fix brings the behavior in line with the JavaScript implementation.

Testing?

Added a unit test case.

Anything Else?

This is my first contribution to this repository, so I would really appreciate any feedback, suggestions, and change requests.