bmwill / diffy

Tools for finding and manipulating differences between files
Apache License 2.0
75 stars 22 forks source link

Patching fails if either patch text or base text have Windows line endings (CRLF) #20

Open Qyriad opened 2 years ago

Qyriad commented 2 years ago

This crate seems to fail if either the patch text or the text being patched contain Windows line endings. For example, the following test cases:

let text_lf = "Allomancy\nFeruchemy\n";
let text_crlf = "Allomancy\r\nFeruchemy\r\n";

let patch_lf =
    "\
    --- original\n\
    +++ modified\n\
    @@ -1,2 +1,3 @@\n Allomancy\n Feruchemy\n+Hemalurgy\n";

let patch_crlf =
    "\
    --- original\r\n\
    +++ modified\r\n\
    @@ -1,2 +1,3 @@\r\n Allomancy\r\n Feruchemy\r\n+Hemalurgy\r\n";

let p_lf = Patch::from_str(&patch_lf).unwrap();
let p_crlf = Patch::from_str(&patch_crlf);

assert!(diffy::apply(text_lf, &p_lf).is_ok());   // OK
assert!(p_crlf.is_ok());                         // ParsePatchError("invalid char in unquoted filename")
assert!(diffy::apply(text_crlf, &p_lf).is_ok()); // ApplyError(1)

The latter two asserts fail. Would you be open to supporting Windows line endings in this crate? I'd be happy to PR if one is welcome.

bmwill commented 2 years ago

Thanks for the bug report! Yes I do think that diffy should better handle windows line endings although I think that fixing one of those asserts is more straight forward than the other.

let p_crlf = Patch::from_str(&patch_crlf).unwrap()

Parsing a patch with CRLF line endings should succeed. Taking a quick look at this it seems like theres an easy fix for this in the patch header parsing logic.

assert!(diffy::apply(text_crlf, &p_lf).is_ok());

This one is a bit more complicated. Applying a patch which has different line endings from the text that is being patched is something that would require a bit more smarts on the matching of where the patch should be applied. Some other tools seem to either fail to apply (e.g git-apply) while others do some fuzzy-matching to get the patch to apply (e.g. unix `patch) but then leave the patched with with mixed line endings. So I'm not quite sure what the best way to tackle this would be.

Fixing the first issue, though, would let you do the following without issue (patching a crlf file with a crlf patch):

diffy::apply(text_crlf, &p_crlf)
Qyriad commented 2 years ago

Yeah now that you mention it, it would probably be reasonable to leave the mixed line endings case alone, though I do think that leaving the patched file with mixed line endings is an acceptable outcome with that kind of edge case.

The other two should be pretty simple, though, yes. I can take a look into them.