HtmlUnit / htmlunit-neko

HtmlUnit adaptation of NekoHtml
Apache License 2.0

Self-closing line break tag <br /> in HTML content is converted into an open <br> tag. #122

Closed: BloodDrag0n closed this issue 3 months ago

BloodDrag0n commented 3 months ago

I am using antisamy to sanitize HTML content. During parsing of the HTML data, the <br /> (self-closing) tag is converted into <br> (open tag). Is there any specific reason behind this behavior change? Is there any way to retain the <br /> tags?

E.g. input HTML data: <p>this is para data</p><br/><p>this is para data 2</p>

E.g. output HTML data: <p>this is para data</p><br><p>this is para data 2</p>

HtmlUnit-Neko version: 3.11.2
Antisamy version: 1.7.5

rbri commented 3 months ago

Hi @BloodDrag0n, this looks like a bug, I will have a look.

rbri commented 3 months ago

@BloodDrag0n I have added a test case, but at first glance this looks OK to me. Any idea?

rbri commented 3 months ago

It looks like the root cause is in antisamy, not in neko; see https://github.com/nahsra/antisamy/issues/484
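
For readers wondering how the same parsed content can come out as either <br> or <br/>, here is a minimal JDK-only sketch. It uses neither antisamy nor htmlunit-neko, so it says nothing about their internals. It only illustrates the general point that the choice between the two forms can be made at serialization time rather than at parse time. The class name BrSerializationDemo and the helper serialize are hypothetical, and the sketch assumes the JDK's built-in identity Transformer honours the standard "html" and "xml" output methods.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: builds one DOM tree and serializes it two ways.
class BrSerializationDemo {

    // Serializes the same DOM using the given output method ("html" or "xml").
    static String serialize(String method) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();

            // A wrapper element is needed because a DOM document has one root.
            Element div = doc.createElement("div");
            doc.appendChild(div);

            Element first = doc.createElement("p");
            first.setTextContent("this is para data");
            div.appendChild(first);

            // The br element has no children; how it is written is up to the serializer.
            div.appendChild(doc.createElement("br"));

            Element second = doc.createElement("p");
            second.setTextContent("this is para data 2");
            div.appendChild(second);

            // Identity transformer: copies the DOM to text without modification.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, method);
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.INDENT, "no");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("html: " + serialize("html"));
        System.out.println("xml:  " + serialize("xml"));
    }
}
```

On a stock JDK, the "html" method is expected to write the empty element as <br> and the "xml" method as <br/>, even though the DOM tree is identical in both cases. If the self-closing form is required, the place to look is therefore the serializer configuration of whichever library writes the output.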