Open Ghyath-Serhal opened 8 months ago
HtmlSanitizer is only intended to sanitize HTML. When a fragment is passed to the Sanitize() method, it is wrapped in a body element before it is parsed by AngleSharp's HTML parser. The parser then drops the additional body tag in the fragment. I currently don't see a way around this. https://github.com/mganss/HtmlSanitizer/blob/28bdf0e0a1a143735a6be7858a38eaea772fcfef/src/HtmlSanitizer/HtmlSanitizer.cs#L386
You can try experimenting with the SanitizeDom() overload that takes an IHtmlDocument. You'd need to coerce AngleSharp into keeping the body element somehow.
In theory, you could also work with the AngleSharp.Xml package, but HtmlSanitizer makes extensive use of AngleSharp's IHtmlDocument interface, so it would probably be hard to add support for XML.
I'm interested to hear what your use case is. Where's the XSS vector in your scenario?
I am using HtmlSanitizer to sanitize the XML data below, which contains a body tag.
I have added tag1, tag2, tag3 and body to the AllowedTags property. I am getting the result below. As you can see, the body tag is removed; I only get the content that was inside it.
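For anyone trying to reproduce this, here is a minimal sketch of the setup described in this thread. The input string is a hypothetical stand-in, since the original XML is not shown, and the namespace is assumed to be Ganss.Xss (older releases used Ganss.XSS). It uses C# top-level statements and only the Sanitize() method and AllowedTags collection mentioned in this thread.

```csharp
using System;
using Ganss.Xss; // assumption: recent releases; older releases used Ganss.XSS

var sanitizer = new HtmlSanitizer();

// Allow the custom tags and body, as described in the report.
foreach (var tag in new[] { "tag1", "tag2", "tag3", "body" })
{
    sanitizer.AllowedTags.Add(tag);
}

// Hypothetical input: XML-like data with a nested body element.
var input = "<tag1><body><tag2>some text</tag2><tag3>more text</tag3></body></tag1>";

// Sanitize() parses the input as an HTML fragment inside a body element,
// so the HTML parser discards the nested body start tag before the
// allow-list is ever consulted.
var output = sanitizer.Sanitize(input);

Console.WriteLine(output);
```

Going by the explanation in this thread, the printed result should keep tag1, tag2 and tag3 but not body, because the tag is lost during parsing rather than during sanitizing.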