cure53 / DOMPurify

DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks. Demo:
https://cure53.de/purify
Other
13.67k stars 698 forks source link

[DRAFT] Support happy-dom #878

Closed luxaritas closed 8 months ago

luxaritas commented 10 months ago

Summary

This PR supports happy-dom in addition to jsdom as a valid DOM library for NodeJS

Resolves #876

Background & Context

happy-dom is an alternative to JSDom with a focus on performance. Minimally, tests using happy-dom need to be added. While both aim to be DOM-compatible, there are some subtle differences in implementation that need to be addressed (or potentially upstream errors that need to be resolved).

Tasks

luxaritas commented 10 months ago

Marking this as not a draft PR so that we can get the CI workflows running

luxaritas commented 10 months ago

There's a lot of noise here from the "fallback value for..." warnings, which appears to be due to in happy-dom parentNode being a property (not a getter or function), causing lookupGetter to fail. In a similar vein, happy-dom implements overrides for cloneNode in subclasses (including a separate implementation for Text which is not an Element at all), which means the cloneNode calls will misbehave since it is using the version on Element's prototype.

I think this needs to be addressed first, since this is impacting so many test cases.

This seems to have been introduced by @securityMB as part of #495 to harden against DOM clobbering. Suggestions for how to handle this?

cure53 commented 10 months ago

My thinking is that happy-dom will likely have to fix that, no? The fix back then was too important to tamper with :sweat_smile:

luxaritas commented 10 months ago

Wasn't sure if there was some other way around it that wouldn't require structural implementation changes upstream, but I feared that may be the case.

I am a bit confused about how DOM clobbering is relevant to DOMPurify in this case though - DOMParser/createDocument only construct "pure" nodes unaffected by changing (in this case) window.Element, right? In which case the properties of the node being used should be able to be trusted. What am I missing? Or is this a proactive protection against DOM implementation bugs (or lack of it being ensured by specifications, if thats relevant)?

cure53 commented 10 months ago

Unfortunately, no :smile: Properties like parentNode and childNodes can very easily be clobbered, the protection is necessary here, as far as I can see.

DOMParser/createDocument only construct "pure" nodes unaffected by changing (in this case) window.Element, right?

Sadly no :sweat_smile:

luxaritas commented 10 months ago

If you have a reference to that behavior happening, I'd be interested to see it - testing on my own I was not able to reproduce (eg, globalThis.Element.__proto__ = null; globalThis.Element = null; window.Element.__proto__ = null; window.Element = null; new DOMParser... still works).

Before I file an issue to happy-dom about the overridden functions/getters, there is one thing I want to check about with respect to parentNode. Given it is a property, wouldn't any clobbering protections be useless against happy-dom since even if it was a getter, there would still be a property it would be pulling from that could be clobbered? Assuming that's the case, I imagine that in order to be truly robust, we'd need to ask happy-dom to change so that all properties are defined by symbols and then accessed via getters (like jsdom does via symbol-tree)? Otherwise we may as well have lookupGetter define its own inline getter in the case the Element/Node prototype defines neither a getter or function.

cure53 commented 10 months ago

I think there might be a confusion between DOM Clobbering and Prototype Pollution.

The risky case we try to harden against is this, if I am not mistaken:

// clobbered
<form id="foo"><input name="parentNode"></form>
<script>console.log(foo.parentNode)</script>

// not clobbered
<form id="bar"><input></form>
<script>console.log(bar.parentNode)</script>

In the mXSS prevention where we look at the namespace of the parent element, we need to be able to trust that we have the actual parent element in our hands, not the clobbering node which would lead to bypassing the check.

cure53 commented 10 months ago

Just to make sure that we are talking about the same issue here... what exact part in the code is problematic? For a moment I thought it's the slightly weird way of how we get access to the correct parent node. But now I am not sure anymore.

I am correct to assume that the code that results non-clobbered values is the problem, no? :smile:

<form id="foo"><input name="parentNode"></form>
<script>
console.log(foo.parentNode) // clobbered
console.log(Element.prototype.__lookupGetter__('parentNode').call(foo)); // not clobbered
console.log(Object.getOwnPropertyDescriptor(Object.getPrototypeOf(Element.prototype), 'parentNode').get.call(foo)); // not clobbered
</script>  
luxaritas commented 10 months ago

In the mXSS prevention where we look at the namespace of the parent element, we need to be able to trust that we have the actual parent element in our hands, not the clobbering node which would lead to bypassing the check.

Ah! Missed that a child element can clobber properties on the parent element, not just window/document.

Just to make sure that we are talking about the same issue here... what exact part in the code is problematic?

It's the usage of __lookupGetter__. To show the current situation, let's look at a repl:

> class Node {parentNode = 5}
undefined
> class Element extends Node {}
undefined
> el = new Element()
Element { parentNode: 5 }
> Object.getOwnPropertyDescriptors(Element)
{
  length: { value: 0, writable: false, enumerable: false, configurable: true },
  name: {
    value: 'Element',
    writable: false,
    enumerable: false,
    configurable: true
  },
  prototype: {
    value: Node {},
    writable: false,
    enumerable: false,
    configurable: false
  }
}
> Object.getOwnPropertyDescriptors(Node)
{
  length: { value: 0, writable: false, enumerable: false, configurable: true },
  name: {
    value: 'Node',
    writable: false,
    enumerable: false,
    configurable: true
  },
  prototype: {
    value: {},
    writable: false,
    enumerable: false,
    configurable: false
  }
}
> Object.getOwnPropertyDescriptors(el)
{
  parentNode: { value: 5, writable: true, enumerable: true, configurable: true }
}

Because of the way parentNode is declared, it is not on the prototype at all.

nb: It appears that happy-dom implements some anti-clobbering internally, though it's not consistent.

I think I have enough information to file an issue with them now - thank you for your patience and explanations.

cure53 commented 10 months ago

I think I have enough information to file an issue with them now - thank you for your patience and explanations.

Awesome, most welcome and I say thanks for your efforts :bow:

cure53 commented 8 months ago

Closing this for now, it seems that the ball is in happy-dom's corner and the project is inactive. Please reopen if anything changes.

capricorn86 commented 5 months ago

Hi! :wave:

I'm the author of Happy DOM.

I believe that these issues has mostly been sorted out now.

Properties are now getters and setters, which was fixed in v13.1.0.

Sub-classes no longer overrides cloneNode(), appendChild(), removeChild(), insertBefore(), replaceChild(), which was fixed in v14.7.1.

Let me know if some property or method was missed, and it should be quite easy to fix it in that case.

cure53 commented 5 months ago

Whoa :wave: Awesome, nice to meet you :slightly_smiling_face:

We'll have a look soon!

cure53 commented 5 months ago

@capricorn86 Someone recently reported a bypass when using latest DOMPurify with happy-dom, should we chat via email and see if we can resolve it or if it still exists?

capricorn86 commented 5 months ago

@capricorn86 Someone recently reported a bypass when using latest DOMPurify with happy-dom, should we chat via email and see if we can resolve it or if it still exists?

@cure53 yes sure. When would be a good for a chat for you? I'm in Swedish timezone.

cure53 commented 5 months ago

Awesome, how about a quick meeting tomorrow 2pm CEST / GMT+2? Happy to shoot over an invite :slightly_smiling_face:

capricorn86 commented 5 months ago

I have another meeting at 2pm (until 3:30pm). The rest of the day i'm available 🙂