Rohland / htmldiff.net

Html Diff algorithm for .NET
MIT License

ConvertHtmlToListOfWords issue. Splitting text into words fitting a block definition #55

Closed pgava closed 1 year ago

pgava commented 1 year ago

There is an issue with WordSplitter.ConvertHtmlToListOfWords(): the text isn't split into words that correspond to the block (group) definition. For example, with a group expression like "<td>(.*?)</td>", the text "<td>a</td><td>c</td>" should split into two words: "<td>a</td>" and "<td>c</td>".
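
To make the expected result concrete, here is a minimal sketch in Python rather than the library's C#. It is not the WordSplitter implementation and not the fix itself; `split_by_block` is a hypothetical helper that only shows each match of the group expression becoming a single word. The choice to keep text outside a match as one unsplit chunk is a simplifying assumption.

```python
import re


def split_by_block(text, block_pattern):
    """Return a list of words where every match of block_pattern is one word.

    Hypothetical helper for illustration only. Text that falls outside
    any match is kept as a single chunk instead of being tokenized further.
    """
    words = []
    pos = 0
    for match in re.finditer(block_pattern, text):
        # Keep any text between the previous match and this one.
        if match.start() > pos:
            words.append(text[pos:match.start()])
        # The whole match (tags included) is treated as one word.
        words.append(match.group(0))
        pos = match.end()
    # Keep any trailing text after the last match.
    if pos < len(text):
        words.append(text[pos:])
    return words


print(split_by_block("<td>a</td><td>c</td>", r"<td>(.*?)</td>"))
# prints ['<td>a</td>', '<td>c</td>']
```

Because `(.*?)` is non-greedy, each match stops at the first closing `</td>`, which is why the two cells come out as separate words rather than one.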

Rohland commented 1 year ago

Thanks for reporting this @pgava 🙏

It's been a while since I've reviewed this project or made changes 😓. Appreciate the help! I did some refactoring while I was at it and included your fix in #56.