mganss / HtmlSanitizer

Cleans HTML to avoid XSS attacks
MIT License

MSO conditional comments #525

Closed stianolsen closed 5 months ago

stianolsen commented 5 months ago

One of the recent releases changed the sanitizer so that characters such as < and > are encoded when they appear inside an HTML comment. We have found that this breaks MSO conditional comments, which use an HTML comment to "hide" the Outlook-specific parts. For example, this:

<h1>Hello</h1><!-- Normal comment --><!--[if mso]> 
<table><tr><td>
       <p>This information will display only in Microsoft Outlook.</p>
   </td></tr></table>
<![endif]-->

becomes the following after being sanitized:

<h1>Hello</h1><!-- Normal comment --><!--[if mso]&gt; 
&lt;table&gt;&lt;tr&gt;&lt;td&gt;
       &lt;p&gt;This information will display only in Microsoft Outlook.&lt;/p&gt;
   &lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;![endif]-->

and Outlook does not seem to recognize the conditional comment when the HTML inside it is encoded. Is there anything that can be done about this?
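
To make the reported behavior concrete, here is a minimal Python sketch that reproduces the transformation shown above. It is not HtmlSanitizer's implementation: the function name and the regex-based comment matching are assumptions for illustration, and it encodes only < and >, the two characters named in this report.

```python
import re

# Matches an HTML comment and captures its body (non-greedy, across newlines).
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def encode_comment_bodies(html: str) -> str:
    """Encode < and > inside every HTML comment, leaving other markup alone.

    Hypothetical helper: it only imitates the behavior described in this
    issue and is not the library's actual algorithm.
    """
    def _encode(match: re.Match) -> str:
        body = match.group(1).replace("<", "&lt;").replace(">", "&gt;")
        return "<!--" + body + "-->"

    return COMMENT_RE.sub(_encode, html)


sample = (
    "<h1>Hello</h1><!-- Normal comment -->"
    "<!--[if mso]><p>Outlook only</p><![endif]-->"
)
print(encode_comment_bodies(sample))
```

The normal comment is unchanged because its body contains no angle brackets, while the conditional comment loses the `]>` and `<![endif]` markers that Outlook looks for.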

mganss commented 5 months ago

I've added a property, EncodeComment, that lets you customize how comments are encoded, similar to what was done for #511. Watch out for possible bypasses if you override the default behavior (see https://github.com/mganss/HtmlSanitizer/security/advisories/GHSA-43cp-6p3q-2pc4).
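
As a rough illustration of what a customized comment encoder might decide, the Python sketch below leaves a comment body raw only when it matches a strict MSO conditional-comment shape and encodes everything else. All names here (`default_encode`, `mso_aware_encode`, `rewrite_comments`) and the pattern are hypothetical. The sketch does not use the library's EncodeComment API, whose signature is not shown in this thread. Any body left unencoded is also left unsanitized, which is the kind of bypass the advisory warns about.

```python
import re

COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)

# Assumed, deliberately strict shape of a conditional comment body:
# "[if <condition>]>" ... "<![endif]" with nothing before or after.
MSO_BODY_RE = re.compile(r"\[if [^\]<>]*\]>.*<!\[endif\]", re.DOTALL)


def default_encode(body: str) -> str:
    """Encode angle brackets, imitating the default described in this issue."""
    return body.replace("<", "&lt;").replace(">", "&gt;")


def mso_aware_encode(body: str) -> str:
    """Leave MSO conditional comment bodies raw; encode all other comments.

    Caution: the raw body is NOT sanitized by this sketch.
    """
    if MSO_BODY_RE.fullmatch(body):
        return body
    return default_encode(body)


def rewrite_comments(html: str, encode=default_encode) -> str:
    """Apply the chosen encoder to the body of every HTML comment."""
    return COMMENT_RE.sub(lambda m: "<!--" + encode(m.group(1)) + "-->", html)


html = "<!-- a < b --><!--[if mso]><p>Outlook only</p><![endif]-->"
print(rewrite_comments(html, mso_aware_encode))
```

A safer variant would sanitize the inner HTML of the conditional comment separately before emitting it, rather than passing it through untouched.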

mganss commented 5 months ago

The change is included in releases 8.0.838 and 8.1.839-beta.