Just a single (breaking) bugfix, undoing a behaviour change introduced accidentally in 6.0.0:
#554diffWords treats numbers and underscores as word characters again. This behaviour was broken in v6.0.0.
6.0.0
This is a release containing many, many breaking changes. The objective of this release was to carry out a mass fix, in one go, of all the open bugs and design problems that required breaking changes to fix. A substantial, but exhaustive, changelog is below.
#497diffWords behavior has been radically changed. Previously, even with ignoreWhitespace: true, runs of whitespace were tokens, which led to unhelpful and unintuitive diffing behavior in typical texts. Specifically, even when two texts contained overlapping passages, diffWords would sometimes choose to delete all the words from the old text and insert them anew in their new positions in order to avoid having to delete or insert whitespace tokens. Whitespace sequences are no longer tokens as of this release, which affects both the generated diffs and the counts.
Runs of whitespace are still tokens in diffWordsWithSpace.
As part of the changes to diffWords, a new .postProcess method has been added on the base Diff type, which can be overridden in custom Diff implementations.
diffLines with ignoreWhitespace: true will no longer ignore the insertion or deletion of entire extra lines of whitespace at the end of the text. Previously, these would not show up as insertions or deletions, as a side effect of a hack in the base diffing algorithm meant to help ignore whitespace in diffWords. More generally, the undocumented special handling in the core algorithm for ignored terminals has been removed entirely. (This special case behavior used to rewrite the final two change objects in a scenario where the final change object was an addition or deletion and its value was treated as equal to the empty string when compared using the diff object's .equals method.)
#500diffChars now diffs Unicode code points instead of UTF-16 code units.
#508parsePatch now always runs in what was previously "strict" mode; the undocumented strict option has been removed. Previously, by default, parsePatch (and other patch functions that use it under the hood to parse patches) would accept a patch where the line counts in the headers were inconsistent with the actual patch content - e.g. where a hunk started with the header @@ -1,3 +1,6 @@, indicating that the content below spanned 3 lines in the old file and 6 lines in the new file, but then the actual content below the header consisted of some different number of lines, say 10 lines of context, 5 deletions, and 1 insertion. Actually trying to work with these patches using applyPatch or merge, however, would produce incorrect results instead of just ignoring the incorrect headers, making this "feature" more of a trap than something actually useful. It's been ripped out, and now we are always "strict" and will reject patches where the line counts in the headers aren't consistent with the actual patch content.
#435Fix parsePatch handling of control characters.parsePatch used to interpret various unusual control characters - namely vertical tabs, form feeds, lone carriage returns without a line feed, and EBCDIC NELs - as line breaks when parsing a patch file. This was inconsistent with the behavior of both JsDiff's own diffLines method and also the Unix diff and patch utils, which all simply treat those control characters as ordinary characters. The result of this discrepancy was that some well-formed patches - produced either by diff or by JsDiff itself and handled properly by the patch util - would be wrongly parsed by parsePatch, with the effect that it would disregard the remainder of a hunk after encountering one of these control characters.
#439Prefer diffs that order deletions before insertions. When faced with a choice between two diffs with an equal total edit distance, the Myers diff algorithm generally prefers one that does deletions before insertions rather than insertions before deletions. For instance, when diffing abcd against acbd, it will prefer a diff that says to delete the b and then insert a new b after the c, over a diff that says to insert a c before the b and then delete the existing c. JsDiff deviated from the published Myers algorithm in a way that led to it having the opposite preference in many cases, including that example. This is now fixed, meaning diffs output by JsDiff will more accurately reflect what the published Myers diff algorithm would output.
#455The added and removed properties of change objects are now guaranteed to be set to a boolean value. (Previously, they would be set to undefined or omitted entirely instead of setting them to false.)
#464 Specifying {maxEditLength: 0} now sets a max edit length of 0 instead of no maximum.
#467Consistent ordering of arguments to comparator(left, right). Values from the old array will now consistently be passed as the first argument (left) and values from the new array as the second argument (right). Previously this was almost (but not quite) always the other way round.
#480Passing maxEditLength to createPatch & createTwoFilesPatch now works properly (i.e. returns undefined if the max edit distance is exceeded; previous behavior was to crash with a TypeError if the edit distance was exceeded).
#486The ignoreWhitespace option of diffLines behaves more sensibly now.values in returned change objects now include leading/trailing whitespace even when ignoreWhitespace is used, just like how with ignoreCase the values still reflect the case of one of the original texts instead of being all-lowercase. ignoreWhitespace is also now compatible with newlineIsToken. Finally, diffTrimmedLines is deprecated (and removed from the docs) in favour of using diffLines with ignoreWhitespace: true; the two are, and always have been, equivalent.
#490When calling diffing functions in async mode by passing a callback option, the diff result will now be passed as the first argument to the callback instead of the second. (Previously, the first argument was never used at all and would always have value undefined.)
#489this.options no longer exists on Diff objects. Instead, options is now passed as an argument to methods that rely on options, like equals(left, right, options). This fixes a race condition in async mode, where diffing behaviour could be changed mid-execution if a concurrent usage of the same Diff instances overwrote its options.
#518linedelimiters no longer exists on patch objects; instead, when a patch with Windows-style CRLF line endings is parsed, the lines in lines will end with \r. There is now a new autoConvertLineEndings option, on by default, which makes it so that when a patch with Windows-style line endings is applied to a source file with Unix style line endings, the patch gets autoconverted to use Unix-style line endings, and when a patch with Unix-style line endings is applied to a source file with Windows-style line endings, it gets autoconverted to use Windows-style line endings.
#521the callback option is now supported by structuredPatch, createPatch, and createTwoFilesPatch
#529parsePatch can now parse patches where lines starting with -- or ++ are deleted/inserted; previously, there were edge cases where the parser would choke on valid patches or give wrong results.
#533applyPatch uses an entirely new algorithm for fuzzy matching. Differences between the old and new algorithm are as follows:
The fuzzFactor now indicates the maximum Levenshtein distance that there can be between the context shown in a hunk and the actual file content at a location where we try to apply the hunk. (Previously, it represented a maximum Hamming distance, meaning that a single insertion or deletion in the source file could stop a hunk from applying even with a high fuzzFactor.)
A hunk containing a deletion can now only be applied in a context where the line to be deleted actually appears verbatim. (Previously, as long as enough context lines in the hunk matched, applyPatch would apply the hunk anyway and delete a completely different line.)
The context line immediately before and immediately after an insertion must match exactly between the hunk and the file for a hunk to apply. (Previously this was not required.)
#535A bug in patch generation functions is now fixed that would sometimes previously cause \ No newline at end of file to appear in the wrong place in the generated patch, resulting in the patch being invalid.
#535Passing newlineIsToken: true to patch-generation functions is no longer allowed. (Passing it to diffLines is still supported - it's only functions like createPatch where passing newlineIsToken is now an error.) Allowing it to be passed never really made sense, since in cases where the option had any effect on the output at all, the effect tended to be causing a garbled patch to be created that couldn't actually be applied to the source file.
#539diffWords now takes an optional intlSegmenter option which should be an Intl.Segmenter with word-level granularity. This provides better tokenization of text into words than the default behaviour, even for English but especially for some other languages for which the default behaviour is poor.
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps diff from 5.2.0 to 7.0.0.
Changelog
Sourced from diff's changelog.
Commits
ad8f0eb
Prep for 7.0.0 release (#556)7d113b6
Fix diffWords treating numbers and underscores as not being word characters (...f9972d6
Fix links to commit lists in release-notes.md (#555)68526b9
Update release-notes.md (#552)e80648d
Release V6.0.0 (#551)a8b639a
Remove use of regex lookbehind to improve compat with old Safari versions (#550)e8db85e
Bump micromatch from 4.0.5 to 4.0.8 (#549)6b5247d
Bump webpack from 5.90.3 to 5.94.0 (#548)739bfff
Fix reference to no-longer-existent 'rollup.config.js' file (#544)3b1ac53
6.0.0-beta (#543)Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting
@dependabot rebase
.Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot show