Rohland / htmldiff.net

Html Diff algorithm for .NET
MIT License

<ins class='mod'> without closing tag #20

Closed N0M3AD closed 3 days ago

N0M3AD commented 8 years ago

If I compare texts in which a special tag like "sup" was added (or deleted), the differ adds an ins tag without closing it.

Here is a simple unit test that shows this behavior:

        [Test]
        public void WrongResultWithSpecialTag()
        {
            // The only difference between the two inputs is the added <sup>1</sup>.
            const string oldText = @"<div class=""dumb"">Thiis a text without any sup-tags and other special things</div>";
            const string newText = @"<div class=""dumb"">Thiis a text <sup>1</sup>without any sup-tags and other special things</div>";
            var expectedText = @"<div class=""dumb"">Thiis a text <sup><ins class='diffins'>1</ins></sup>without any sup-tags and other special things</div>";
            Debug.WriteLine("Old text: " + oldText);
            Debug.WriteLine("New text: " + newText);
            Debug.WriteLine("");
            Debug.WriteLine("Expected Diff: " + expectedText);
            var result = new global::HtmlDiff.HtmlDiff(oldText, newText).Build();
            Debug.WriteLine("");
            Debug.WriteLine("Actual Diff: " + result);

            // The actual result contains an <ins class='mod'> that is never closed:
            // "<div class=""dumb"">Thiis a text <sup><ins class='mod'><ins class='diffins'>1</ins></sup>without any sup-tags and other special things</div>"
            Assert.AreEqual(expectedText, result);
        }
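
The imbalance in the actual result can also be checked mechanically. The following is a minimal sketch and not part of the library: the class and method names are hypothetical, and the input is the actual output quoted in the comment of the test above. It simply counts opening and closing ins tags.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of htmldiff.net.
public static class InsBalanceCheck
{
    // Counts opening <ins ...> tags; \b keeps other tag names starting with "ins" from matching.
    public static int CountOpen(string html) => Regex.Matches(html, "<ins\\b").Count;

    // Counts closing </ins> tags.
    public static int CountClose(string html) => Regex.Matches(html, "</ins>").Count;

    // Actual diff output as reported in the issue.
    public const string ActualResult =
        @"<div class=""dumb"">Thiis a text <sup><ins class='mod'><ins class='diffins'>1</ins></sup>without any sup-tags and other special things</div>";

    public static void Main()
    {
        Console.WriteLine("opening ins tags: " + CountOpen(ActualResult));
        Console.WriteLine("closing ins tags: " + CountClose(ActualResult));
    }
}
```

For the reported output the two counts differ, which is the unclosed `<ins class='mod'>` described above.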

GBriotti commented 5 years ago

I have noticed similar behavior. I don't know whether it is related to this (wrong) regex in Utils.cs:

private static Regex openingTagRegex = new Regex("^\\s*<[^>]+>\\s*$", RegexOptions.Compiled);

which matches both opening and closing tags, or to problems with the stack...
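
To illustrate the point about the pattern, here is a small sketch that runs the quoted regex against an opening and a closing tag. The second pattern, `OpeningOnly`, is a hypothetical stricter variant added only for comparison; it is an assumption of this example, not the library's code and not necessarily how the issue was fixed.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustration only; class and field names are hypothetical.
public static class OpeningTagRegexDemo
{
    // The pattern quoted above.
    public static readonly Regex Quoted =
        new Regex("^\\s*<[^>]+>\\s*$", RegexOptions.Compiled);

    // Hypothetical variant: the negative lookahead (?!/) rejects any tag
    // whose first character after '<' is '/', i.e. closing tags.
    public static readonly Regex OpeningOnly =
        new Regex("^\\s*<(?!/)[^>]+>\\s*$", RegexOptions.Compiled);

    public static void Main()
    {
        foreach (var token in new[] { "<sup>", "</sup>", "<ins class='mod'>", "1" })
        {
            Console.WriteLine(
                $"{token,-20} quoted={Quoted.IsMatch(token)} openingOnly={OpeningOnly.IsMatch(token)}");
        }
    }
}
```

The quoted pattern accepts both `<sup>` and `</sup>`, because `[^>]+` also consumes the leading slash. The variant accepts only `<sup>`. Whether this regex is actually the cause of the unclosed tag is not established in this thread.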

PhilippRiegelmann commented 2 weeks ago

I believe I have encountered a similar problem when inserting an additional word at the end of newly formatted text. I have developed a possible solution and opened a PR (https://github.com/Rohland/htmldiff.net/pull/61), with additional tests that cover as many edge cases of this problem as I could think of. I have also added the case you described above as a test, although I believe that an opening <ins class='mod'> was missing from your expected result.

Rohland commented 3 days ago

Thanks, this is closed by #62. Version 1.5.0 should be available on NuGet (if not yet, then shortly after indexing).