cure53 / DOMPurify

DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks. Demo:
https://cure53.de/purify

Sanitize MIME message #25

Closed toberndo closed 10 years ago

toberndo commented 10 years ago

I'm trying to sanitize a MIME message:

MIME-Version: 1.0
Sender: xxx@xxx.xx
Received: by 10.60.155.104 with HTTP; Mon, 5 May 2014 06:08:54 -0700 (PDT)
Date: Mon, 5 May 2014 15:08:54 +0200
Delivered-To: xxx@xxx.xx
Message-ID: <CAMQ7_A76jtf9Q1bx=G_K9miTdy0C2LPo7xuECNmGpDzmQPcZ4A@mail.gmail.com>
Subject: Hello
From: =?UTF-8?Q?xxx?= <xxx@xxx.com>
To: =?UTF-8?Q?xxx?= <xxx@xxx.com>
Content-Type: multipart/alternative; boundary=089e0115e7e4537fcd04f8a6d552

--089e0115e7e4537fcd04f8a6d552
Content-Type: text/plain; charset=UTF-8

Test

--089e0115e7e4537fcd04f8a6d552
Content-Type: text/html; charset=UTF-8

<div dir="ltr">Test</div>

--089e0115e7e4537fcd04f8a6d552--

and with 0.3 in its default configuration I get the following output:

MIME-Version: 1.0
Sender: xxx@xxx.xx
Received: by 10.60.155.104 with HTTP; Mon, 5 May 2014 06:08:54 -0700 (PDT)
Date: Mon, 5 May 2014 15:08:54 +0200
Delivered-To: xxx@xxx.xx
Message-ID: 

Is it intended that everything after a <xxx@xxx.xxx> is truncated? And if yes, is there a setting to prevent that?

Thanks.

ohpe commented 10 years ago

Technically, everything after the Message-ID is truncated because the value starts with <CAMQ7_A76jtf9Q1bx=G_K9miTdy0C2LPo7xuECNmGpDzmQPcZ4A@mail.gmail.com>, which is parsed as a tag, and that tag is not allowed. In any case, this is a DOM-only sanitizer for HTML, MathML and SVG. You should check the Content-Type header field and sanitize only the HTML part.

toberndo commented 10 years ago

Ok, makes sense. That means I would parse the message first and send the text/html part to DOMPurify. Thanks
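
For anyone landing here later, the "parse first, then sanitize" approach could look roughly like the sketch below. It is only an illustration, not part of DOMPurify: `extractHtmlPart` is a hypothetical helper name, and it assumes a flat multipart message like the one above (no nested multiparts, no folded headers, no Content-Transfer-Encoding to decode). A real mail parser handles far more cases.

```javascript
// Minimal sketch: pull the text/html part out of a simple multipart message.
// extractHtmlPart is a hypothetical helper, not a DOMPurify API.
function extractHtmlPart(rawMessage) {
  // Normalise line endings so the header/body separator is always "\n\n".
  const message = rawMessage.replace(/\r\n/g, "\n");

  // The top-level headers end at the first blank line.
  const headerEnd = message.indexOf("\n\n");
  if (headerEnd === -1) return null;
  const headers = message.slice(0, headerEnd);
  const body = message.slice(headerEnd + 2);

  // Read the boundary parameter from the top-level Content-Type header.
  const boundaryMatch = headers.match(/^Content-Type:.*boundary="?([^";\n]+)"?/im);
  if (!boundaryMatch) return null;
  const delimiter = "--" + boundaryMatch[1];

  // Split the body at each delimiter and drop the preamble before the first one.
  // (Simplification: this does not check that the delimiter starts a line.)
  const parts = body.split(delimiter).slice(1);
  for (const part of parts) {
    if (part.startsWith("--")) break; // closing delimiter "--boundary--" reached
    const trimmed = part.replace(/^\n/, "");
    const split = trimmed.indexOf("\n\n");
    if (split === -1) continue;
    const partHeaders = trimmed.slice(0, split);
    const partBody = trimmed.slice(split + 2);
    if (/^Content-Type:\s*text\/html/im.test(partHeaders)) {
      return partBody.trim();
    }
  }
  return null;
}
```

Only the returned string would then go to the sanitizer, e.g. `DOMPurify.sanitize(html)`, so the mail headers (including the angle-bracketed Message-ID) never reach the HTML parser.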