Closed willkg closed 2 years ago
@jvanasco @stucox @FelixSchwarz @kiilerix @Alex3917 @tony @EmilStenstrom @t-ionut and anyone else following along with sanitize css issues in Bleach, does this work for you? What else should we be thinking about and testing?
Thanks @willkg, this looks great to me! It seems to be ~5% faster also. The only difference I notice over the ~400 tests I have for this is just the spaces being collapsed after colons, so e.g. 'font-space: 700' becoming 'font-space:700'. But that's fine with me!
... @kiilerix ... does this work for you? What else should we be thinking about and testing?
What is the goal of the CSS sanitization? I don't see anything in goals.rst that justifies it. If we trust browsers to handle garbage "safely" and don't care about "looks", what is left? And without an explicit statement of the goal, it is hard to comment on whether this change achieves it or not.
The problems I saw are fixed already by removing the previous unnecessarily strict and arbitrary CSS validation / sanitization.
I'm not sure about the use of tinycss2. While it presumably is "safe" for parsing arbitrary CSS, it might not provide the expected safety properties when used in this way. It is extra complexity and dependency. Does it really add any kind of actual security that justifies it?
I think I'm +1 for dropping the old code and -0 for introducing tinycss2.
@kiilerix The issue with CSS is more that if a user can change the CSS for another user, then they can trick that user into doing dangerous things. E.g. if there is a button that says "Delete Account", and a malicious user changes it to say "Get Free Sample!" or something. Here is an article about this that was on Hacker News this week:
I used tinycss2 in the past without problems (though I did not really do complicated stuff using the library). From a Fedora perspective we have v1.1+ in all supported Fedora releases and tinycss2 1.0.x in EPEL 8.
Our main application is a Content Management System. On a given "property" there are two general types of users: "admin" and "public".
We completely block inline CSS with bleach for security and UX concerns for all users in all contexts. I could not imagine a way to allow it in user-generated content that did not introduce issues.
We allow "admins" to upload custom templates. Those are cleaned by bleach and allow class
attributes for CSS customization, and only injected into the public facing content templates – for security concerns there is a strict separation in where those can be injected, so it can not affect anything touching account or property management. Those are parsed by another library, I think it is tinycss. We created an overly complex way of implementing and restricting the CSS overrides with the template structure and load order, I don't recall the specifics offhand.
@Alex3917 Additional concerns with CSS include breaking UX, disrupting comment systems, and sneaking in tracking pixels or images via "url". The latter the same vector as described for data exfiltration in the link you shared, but something I saw attempted often by bots and scammers when I ran a large website with a highly active comment system.
^^^ That improves things a bit.
css
extras which will install tinycss2. Thus tinycss2 is now an optional dependency. It specifies an upper and lower bound for acceptable versions.I spent some time looking at adding support for sanitizing <style>
content. I think I'm going to push that off.
I still need to figure out AtRule handling.
I'm going to punt on the AtRule handling. It seems nice for completeness, but I don't have any good ideas on how to do it and if someone needs it, they can write their own CSSSanitizer since they probably have other needs as well. Given that, I think I'm done working on this and it's ready for review.
@g-k Thank you for helping me through that!
This switches to using tinycss2 for parsing CSS rules and serializing CSS tokens after sanitizing.
Things to think about:
clean
andCleaner
APIs to make it easier for users to use their own CSS sanitizers?<style>
tags?