I believe the OWASP HTML Sanitizer balances and reformats HTML to some extent before sanitizing it, and that malformed HTML could give rise to XSS vectors. I understand that browsers interpret and parse such HTML differently, and that input is ideally expected to conform to the HTML specification.
In a case where the HTML is controlled by a third party, the following HTML is rendered correctly by the browser (as a separate row), but once it goes through the HTML Sanitizer it is rendered as a column outside the table.
1) I see there are listeners for the removal of tags/attributes. Is there something similar for HTML rewriting, where we could skip sanitization and return the text as is without rewriting, or do whatever else is required in the listener?
2) Why is this treated differently by browsers and by the sanitizer? Is it due to the parser? Any suggestions would be helpful.
htmlContent.txt