OWASP / java-html-sanitizer

Takes third-party HTML and produces HTML that is safe to embed in your web application. Fast and easy to configure.

Extra characters got added during sanitization of html #262

Open arpitbansal1581 opened 2 years ago

arpitbansal1581 commented 2 years ago

Hi,

We are using the owasp-java-html-sanitizer-20211018.2.jar library to sanitize custom-generated HTML. During sanitization, extra characters were added to the output in the following case.

Input String: {1:F21TEMPBIC}{4:{177:2203031005}{451:0}}{{311:ACK}{108:MA33A03110SZ0TFC}}

Output String: {1:F21TEMPBIC}{4:{177:2203031005}{451:0}}{<!-- --> {311:ACK}{108:MA33A03110SZ0TFC}}

It would be great if someone could advise on how to handle this situation, or if it could be considered as an enhancement or bug fix.

subbudvk commented 8 months ago

@arpitbansal1581 This is expected: {{ is converted into {<!-- -->{ to avoid XSS arising from client-side templates.

Also, this library is best suited to sanitizing HTML strings.

https://github.com/OWASP/java-html-sanitizer/blob/master/docs/client-side-templates.md
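
To make the reported rewrite concrete, here is a minimal JDK-only sketch that mimics the observable behaviour described above: an empty HTML comment is placed between two adjacent opening braces. It is an illustration, not the sanitizer's actual implementation; the class name `TemplateBraceDemo` and the method `breakDoubleBraces` are hypothetical and do not exist in the library.

```java
// Illustration only: mimics the observable "{{" rewrite with plain JDK code.
// This is NOT the sanitizer's implementation; all names here are hypothetical.
final class TemplateBraceDemo {

    /** Inserts an empty HTML comment between adjacent opening braces. */
    static String breakDoubleBraces(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            out.append(c);
            // When '{' is directly followed by another '{', separate the two
            // so that a client-side template engine no longer sees "{{".
            if (c == '{' && i + 1 < text.length() && text.charAt(i + 1) == '{') {
                out.append("<!-- -->");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input string taken from the report above.
        String input =
            "{1:F21TEMPBIC}{4:{177:2203031005}{451:0}}{{311:ACK}{108:MA33A03110SZ0TFC}}";
        System.out.println(breakDoubleBraces(input));
    }
}
```

A browser does not render the inserted comment, so the text looks unchanged once the output is displayed as HTML. The extra characters only show up when the sanitized string is read back as raw text, which is presumably what happens with a payload like the one in this report.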