DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks. Demo:
This issue is generally seeking more information, not necessarily highlighting a bug or proposing a new feature.
Background & Context
When sanitizing input and allowing both HTML and MathML (USE_PROFILES: { mathMl: true, html: true }), it seems that MathML Content Markup is being removed entirely. MathML Presentation Markup behavior is as expected.
It's not a bug, since these tags are being disallowed on purpose here.
The library has flexibility to add these back in if necessary:
const clean = DOMPurify.sanitize(dirty, {ADD_TAGS: ['my-tag']});
I'm curious to learn what the reason is for disallowing these in the first place. Any info would be appreciated! If there are common security risks associated with these, then I wouldn't want to allow them. If there isn't much of a security risk, then I could add them back in, or maybe they could not be disallowed at the library level.
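For reference, this is roughly what "adding them back in" would look like on my side. It is only a sketch: `buildMathConfig` and `CONTENT_MARKUP_TAGS` are hypothetical names of my own, and the tag list contains only the two elements I saw being removed. Whether re-allowing them is safe is exactly what I'm asking about, so I have not shipped this.

```javascript
// Hypothetical list: only the elements named in this issue, not a complete
// inventory of what DOMPurify disallows for MathML.
const CONTENT_MARKUP_TAGS = ['semantics', 'annotation-xml'];

// Hypothetical helper: combines the profile settings from this issue with
// extra tags passed through DOMPurify's ADD_TAGS option.
function buildMathConfig(extraTags = []) {
  return {
    USE_PROFILES: { mathMl: true, html: true },
    ADD_TAGS: [...extraTags],
  };
}

const config = buildMathConfig(CONTENT_MARKUP_TAGS);

// With DOMPurify loaded (browser, or Node with a DOM implementation):
// const clean = DOMPurify.sanitize(dirty, config);
console.log(config);
```

Keeping the extra tags in one named list makes it easy to drop them again if they turn out to be risky.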
Input
Mixed Markup Examples
Given output
Returns only the <math> element enclosing the first <mrow> child and its children. <semantics> and <annotation-xml> (along with its children) are removed.
Expected output
I expected that it would do this since it's purposefully disallowed. Just curious to learn more about the security risks behind these MathML Content Markup elements.
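In case it helps anyone reproduce this, here is a small sketch of how the dropped elements can be listed by comparing the input and output strings. The two sample strings are hand-written to mirror the behavior described above; they are not captured DOMPurify output, and `tagNames` and `removedTags` are hypothetical helpers. The regex is only good enough for a quick diff and is not a real parser.

```javascript
// Collect the distinct element names that appear as opening tags.
function tagNames(markup) {
  const names = new Set();
  for (const match of markup.matchAll(/<([A-Za-z][\w-]*)/g)) {
    names.add(match[1].toLowerCase());
  }
  return names;
}

// Element names present in the input but missing from the output, sorted.
function removedTags(dirty, clean) {
  const kept = tagNames(clean);
  return [...tagNames(dirty)].filter((name) => !kept.has(name)).sort();
}

// Hand-written samples (assumption): mixed markup in, presentation markup out.
const dirty =
  '<math><semantics><mrow><mi>x</mi></mrow>' +
  '<annotation-xml encoding="MathML-Content"><ci>x</ci></annotation-xml>' +
  '</semantics></math>';
const clean = '<math><mrow><mi>x</mi></mrow></math>';

const removed = removedTags(dirty, clean);
console.log(removed); // [ 'annotation-xml', 'ci', 'semantics' ]
```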
Feature
If there are no security risks, maybe these MathML Content Markup elements could be allowed at the library level. I'm not expecting this to be the case necessarily, since they've been purposefully disallowed.