Closed mjbvz closed 1 month ago
I think the property you want for checking comments is nodeName
, as in data.nodeName === '#comment'
, since HTML comment nodes don't actually expose a tagName
property. It's worth noting though that in my quick testing to reproduce this, comments are filtered without specifying removal from the default DOMPurify settings. They're removed from both of the dirty string examples in my attempt to repro:
<!doctype html>
<html>
<head>
<script src="./dist/purify.js"></script>
</head>
<body>
<h1>Sanitized Content:</h1>
<div id="sanitized"></div>
<script>
"use strict";
const dirty1 =
"<!-- BIG COMMENT ENERGY --><p>HELLO<iframe/\/src=JavScript:alert(1)><!-- comment 2 --></ifrAMe><br>goodbye</p>";
const dirty2 = `<p> <!-- comment1 <div></div> --> <b>BOLD</b> </p>`;
DOMPurify.addHook("uponSanitizeElement", (node, data) => {
if (data.nodeName === "#comment") {
node.parentNode.removeChild(node);
} else if (data.tagName === "b") {
node.parentNode.removeChild(node);
}
});
const clean1 = DOMPurify.sanitize(dirty1);
const clean2 = DOMPurify.sanitize(dirty2);
console.log("dirty1: ", dirty1);
console.log("clean1: ", clean1);
console.log("dirty2: ", dirty2);
console.log("clean2: ", clean2);
const sanitized = document.getElementById("sanitized");
sanitized.innerHTML = `
<h1>Sanitized 1</h1>
${clean1}
<h1>Sanitized 2</h1>
${clean2}
`;
</script>
</body>
</html>
The above example logs the following sanitized HTML template strings, with or without the fixed conditional checking for data.nodeName === "#comment"
:
dirty1: <!-- BIG COMMENT ENERGY --><p>HELLO<iframe//src=JavScript:alert(1)><!-- comment 2 --></ifrAMe><br>goodbye</p>
clean1: <p>HELLO<br>goodbye</p>
dirty2: <p> <!-- comment1 <div></div> --> <b>BOLD</b> </p>
clean2: <p> </p>
Thanks for taking a look!
comments are filtered without specifying removal from the default DOMPurify settings
Yes sorry I forgot to say that my example code is just the minimal code that reproduces the issue. What we're actually trying to do is replace any removed elements with text representations, so for example comment nodes become text nodes containing the text of the comment. Something like this:
DOMPurify.addHook('uponSanitizeElement', (node, data) => {
// Make comments visible by replacing them with text nodes
if (data.tagName === '#comment') {
const textNode = document.createTextNode('<!--' + node.data + `-->`);
node.replaceWith(textNode)
}
});
This was working fine in 3.1.5 and older, but seems to have been broken by 6e03334bab86a8af93cece701ed21bc9ca37eb8e
Heya, thanks for the report - sadly the way we used remove()
before was not 100% safe so we needed to change that.
I would recommend to address the bug in your hook, as we cannot do much in the core library.
Thanks. I ended up working around this by inserting a new text node with the comment content, and then letting dompurify strip out the actual comment node as normal
I don't know enough to say for sure this is safe/correct, but I think this could be fixed by checking if the parent node is defined before calling removeChild
on it: getParentNode(node)?.removeChild(node);
That way Element.prototype.remove
would not be called in catch
Good question indeed :D Here is the attack vector by the way - wanna give it a try?
hello<form><input name=nodeName><input name=parentNode><input name=remove>
Hmmmmm.
On second thought and closer analysis, it's likely that using optional chaining will cause a security bug. Given you already have a solution that works for you, I would say let's better not touch the core :slightly_smiling_face:
With 3.1.7, I'm trying to replace / remove HTML comment nodes in the
uponSanitizeElement
hook. Here's a minimal example:This causes the following exception in Chrome, Safari, and Firefox:
It seems like
Element.prototype.remove
is being invoked on a element that isn't considered anElement
If you remove the
data.tagName === '#comment'
conditional, the code works fine removing/replacing<b>
elements. It only seems to happen for comments