yesodweb / haskell-xss-sanitize

prevent XSS attacks by sanitizing html (this is different then escaping!)
Other
20 stars 15 forks source link

Update MathML lists #9

Open rgaiacs opened 9 years ago

rgaiacs commented 9 years ago

Address #8.

gregwebs commented 9 years ago

Sorry, I wasn't getting notified about this repo. In the README I state that we mirror what is done in html5lib. But at least some of these additions are not there, or here. How do I know that these MathML elements are safe?

gregwebs commented 9 years ago

Another more active project does not have all of these mathML elements.

rgaiacs commented 9 years ago

@gregwebs sorry for the long delay. @fred-wang, could you help me with @gregwebs questions?

fred-wang commented 9 years ago

See https://github.com/html5lib/html5lib-python/pull/181 for html5lib and https://code.google.com/p/feedparser/issues/detail?id=433 for feedparser.

Kuma has more MathML elements in https://github.com/mozilla/kuma/blob/master/kuma/wiki/constants.py

AFAIK, the only MathML features causing problem is "href" (but I assume you are sanitizing the attribute value to remove the Javascript from it). Also, maction@statusline can be used in combination with "href" to hide "evil" links.

gregwebs commented 9 years ago

I see Venus and MDN making MathML changes, but html5lib and feedparser have not taken in changes yet.

Where does MDN display untrusted tags?

gregwebs commented 7 years ago

So it looks like my reference of the html5lib sanitizer is stalled. The list here differs from what Kuma has. FeedParser does have this. Do you know how feedparser gets their tag list? How are these known to be safe?

fred-wang commented 7 years ago

@gregwebs Well, I believed I submitted PRs to FeedParser and HTML5Lib a long time ago. The maintainer of the former project reviewed and accepted the changes. But the maintainer of the latter never got any time to check it so I just gave up...

You can find the exhaustive list in the HTML5 validator schema (mathml3-common.rnc and mathml3-presentation.rnc are the important parts): https://github.com/validator/validator/tree/master/schema/mml3

Most elements are just presentation tags, the only potential problems I know are https://github.com/yesodweb/haskell-xss-sanitize/pull/9#issuecomment-73731539 . Maybe Mozilla Security engineer @jruderman can confirm. Of course, if you do not trust others and really want to be sure you should read the MathML recommendation and understand the tags/attributes :-)

gregwebs commented 7 years ago

Yeah, href is already handled.

The HTML5 validator just tells you what is valid, but it doesn't have anything to do with sanitizing for security? I found another xss lib, but unfortunately it does not have MathML: https://github.com/leizongmin/js-xss/blob/master/lib/default.js

Since I have never used MathML it is really difficult for me to judge the security issues. I really need to see some documentation on the safety of these MathML tag. This could be from

Really, this library has a fundamental issue that there is no longer a quality source of safe tags. Any help in finding that source is greatly appreciated.

The other approach to be taken here is to have an option to allow all MathML tags.

fred-wang commented 7 years ago

The HTML5 validator just tells you what is valid, but it doesn't have anything to do with sanitizing for security?

Well, it tells you the list of MathML tags / elements but as I said to be sure about their safety, one would have to do the analysis. I'm not aware of such public documentation unfortunately.

gregwebs commented 7 years ago

So the best option to maintain security in this library would seem to be to add an option that turns on all MathML