readthedocs / ext-theme

Read the Docs drop in replacement site templates
2 stars 2 forks source link

Notifications: use sanitize-html for cleaning header/body content #322

Open agjohnson opened 5 months ago

agjohnson commented 5 months ago

This was a library that was brought in for displaying ANSI color codes in the build detail command output. It is currently unused, but was loaded async previously. It seems we could use this for notifications as well, making it a standard part of our bundled JS. Or, alternatively, load it async when there are notifications.

We only need to support some minor elements in notification body/header, mostly formatting and links.

https://github.com/readthedocs/ext-theme/blob/bd12f154ae54d3d0c92bc37d4415a661b75552ef/src/js/build/detail.js#L153-L179

stsewd commented 5 months ago

Another popular option is DOMPurify https://github.com/cure53/DOMPurify

agjohnson commented 3 months ago

@stsewd What sort of scenario would this help cover? I know we talked about cleaning the notificaiton HTML before DOM insertion, but if the notification HTML is safe inside our API, is there any case where this wouldn't be safe as we are injecting it into the DOM?

stsewd commented 3 months ago

As long as our notifications don't include user content withotu escaping, we shuold be good.

agjohnson commented 3 months ago

@humitos are we consistently escaping user input in notifications through the API? I couldn't actually find where we were sanitizing message body/header content. I thought this happened at the model or API level

humitos commented 3 months ago

We are not rendering user-input in general. There could be some specific/deprecated cases where get a value from a command's output --but it may not be there anymore, probably.

We are using Django Templates to render them, so they may be already sanitized. If we really want to do extra sanitization here, we could add some extra code at rendering time: https://github.com/readthedocs/readthedocs.org/blob/276590c43bc5f431715643ffa164e284d4433052/readthedocs/notifications/messages.py#L72-L78

Probably this is a non issue, tho.

agjohnson commented 3 months ago

Yeah, I feel like maybe we aren't sure about some of the old notifications, and that is probably the only place we're likely to encounter dirty input.

I think an explicit step at rendering makes sense, or we use the JS library to do this on the fly, either works. It does feel like we should ensure no user goes into the DOM unescaped though.

agjohnson commented 2 months ago

It sounds like we aren't 100% certain here, so I am going to lean towards @stsewd suggestion and say we should implement sanitize-html in our code. The addition should be really minor and there is already an example of how to use this too. I don't think we need to worry about async loading of the library anymore though, as we will need this on every view essentially.

I'm going to consider this medium priority work for our beta phase.

stsewd commented 2 months ago

To be fair, if we are allowing user content as the notification text (not to be confused as user content as variables), we have a bigger problem, since they can evaluate template code (can lead to expose sensitive information or run arbitrary code).

agjohnson commented 2 months ago

I believe this should never be the case, I can't think of why we would need to give the user this power. We might be using user data in the template context though, which this helps prevent.