Open PriyankPurwar opened 5 years ago
Same question then as https://github.com/OWASP/java-html-sanitizer/issues/143#issuecomment-392858011
How is this a problem?
The HTML numeric character reference for 😞 (for example `&#128542;`) should be equivalent to the literal 😞.
I'll second this issue. Some apps save data that's used across multiple output channels (an HTML website and a mobile app, for example). Unicode would work fine in both, but HTML entities would not work in an app that uses native controls rather than a webview.
@alecl, this library outputs HTML. How are its output conventions relevant to apps that use native controls?
I think of it as a gold standard for sanitizing HTML, not necessarily for transforming existing data, even into HTML-compatible formats.
It's not that esoteric a use case to have one database entry serve as the source data for display in multiple channels (web, mobile app, even e-mail).
Another option for us would be to use an HTML stripping tool, but those are often naive, removing brackets indiscriminately or doing other odd things. This tool is a much smarter implementation.
Is there a way to skip the sanitization of emojis?
This was the old issue (https://github.com/OWASP/java-html-sanitizer/issues/143), but I don't see any reasonable conclusion there.
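In case it helps anyone with the same multi-channel requirement, here is a rough sketch of a post-processing workaround: run the sanitizer as usual, then turn numeric character references for non-ASCII code points back into literal characters. This is not part of the library; `EmojiEntityDecoder` and `decodeNonAscii` are names I made up for illustration, and the sketch assumes the result is stored and served as UTF-8. References to ASCII characters such as `&#60;` are deliberately left alone so that markup-significant characters stay escaped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of java-html-sanitizer.
final class EmojiEntityDecoder {
    // Matches hexadecimal (&#x1f61e;) and decimal (&#128542;) numeric character references.
    private static final Pattern NUMERIC_REF =
        Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    private EmojiEntityDecoder() {}

    // Replaces references to non-ASCII code points with the literal character.
    // Everything else, including named entities like &lt;, is copied through unchanged.
    static String decodeNonAscii(String sanitizedHtml) {
        Matcher m = NUMERIC_REF.matcher(sanitizedHtml);
        StringBuilder out = new StringBuilder(sanitizedHtml.length());
        int last = 0;
        while (m.find()) {
            out.append(sanitizedHtml, last, m.start());
            int cp = m.group(1) != null
                ? Integer.parseInt(m.group(1), 16)
                : Integer.parseInt(m.group(2), 10);
            if (shouldDecode(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(m.group()); // keep the reference exactly as written
            }
            last = m.end();
        }
        out.append(sanitizedHtml, last, sanitizedHtml.length());
        return out.toString();
    }

    // Assumption: only code points at or above U+00A0 are decoded, which skips
    // ASCII and the C1 control range; surrogates and out-of-range values are
    // never decoded.
    private static boolean shouldDecode(int cp) {
        return cp >= 0xA0
            && Character.isValidCodePoint(cp)
            && !(cp >= 0xD800 && cp <= 0xDFFF);
    }
}
```

You would call `EmojiEntityDecoder.decodeNonAscii(...)` on the string returned by the sanitizer before saving it. A literal `&#128542;` that a user typed as text comes out of the sanitizer as `&amp;#128542;`, which the pattern does not match, so it is left untouched. A native (non-HTML) client would still need to decode the remaining entities such as `&lt;` and `&amp;` itself.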