Was Sanitized Flag

A single policy p can be used to sanitize many pieces of user-supplied content, so a boolean on the policy does not make sense: it would always return true after the first call of p.Sanitize(html).

My recommendation is to track this state in your application, alongside the content that you are sanitizing.

As a guide, my approach is to store both the raw (unsanitized) input and the sanitized output in my database. I have a NOT NULL column for the raw input and a NULLable column for html. I INSERT into raw, and only when I need to return the sanitized content to a client (API or web) do I check the html column. If it is still NULL, I read raw, sanitize it, write the result into html, and then return the sanitized string to the client.
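A minimal sketch of that read path, using an in-memory map in place of the database table so that it runs on its own. The names Store, Row, Render, and escapeOnly are hypothetical and not part of bluemonday. The Sanitizer interface is an assumption made for the example; a *bluemonday.Policy should satisfy it, since its Sanitize method takes and returns a string. The stand-in sanitizer here only escapes HTML and is not a substitute for a real policy.

```go
package main

import (
	"fmt"
	"html"
)

// Sanitizer is the one method this sketch needs. A *bluemonday.Policy
// has a Sanitize(string) string method and can be passed in directly.
type Sanitizer interface {
	Sanitize(s string) string
}

// escapeOnly is a stand-in so the example has no third-party imports.
// It escapes everything; it is NOT equivalent to a bluemonday policy.
type escapeOnly struct{}

func (escapeOnly) Sanitize(s string) string { return html.EscapeString(s) }

// Row mirrors the two columns described above: Raw is always set
// (NOT NULL), HTML is nil until the content has been sanitized (NULL).
type Row struct {
	Raw  string
	HTML *string
}

// Store is a hypothetical in-memory stand-in for the table.
type Store struct {
	rows  map[int]*Row
	san   Sanitizer
	Calls int // number of times Sanitize was actually invoked
}

func NewStore(san Sanitizer) *Store {
	return &Store{rows: make(map[int]*Row), san: san}
}

// Insert stores only the raw input; nothing is sanitized yet.
func (s *Store) Insert(id int, raw string) {
	s.rows[id] = &Row{Raw: raw}
}

// Render returns sanitized HTML for id, sanitizing lazily on first use
// and reusing the cached value on later calls.
func (s *Store) Render(id int) (string, bool) {
	row, ok := s.rows[id]
	if !ok {
		return "", false
	}
	if row.HTML == nil {
		out := s.san.Sanitize(row.Raw)
		row.HTML = &out
		s.Calls++
	}
	return *row.HTML, true
}

func main() {
	store := NewStore(escapeOnly{})
	store.Insert(1, "<b>hi</b>")
	first, _ := store.Render(1)
	second, _ := store.Render(1)
	// The second call is served from the cached HTML field.
	fmt.Println(first == second, store.Calls)
}
```

The "was it sanitized?" state lives on the row (HTML is nil or not), not on the policy, so one policy can serve every row.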

I like this approach because, if a vulnerability is discovered in bluemonday, I can simply update my table to NULL the html column and everything will be re-sanitized the next time it is requested. The html column also acts as a cache, which avoids the unnecessary CPU cost of calling p.Sanitize() on something I have already sanitized.
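The reset step might look something like the sketch below. The table name content is an assumption (only the raw and html columns are described here), and ResetSanitized, execer, and fakeDB are hypothetical names. The execer interface matches the Exec method of *sql.DB, so a real connection can be passed in; the fake exists only so the example runs without a database driver.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"fmt"
)

// execer is the subset of *sql.DB (and *sql.Tx) this sketch uses.
type execer interface {
	Exec(query string, args ...any) (sql.Result, error)
}

// resetQuery clears the cached sanitized output. The table name
// "content" is assumed; the html column is the NULLable cache column.
const resetQuery = `UPDATE content SET html = NULL WHERE html IS NOT NULL`

// ResetSanitized NULLs every cached html value so that each row is
// re-sanitized the next time it is read. It returns the row count
// reported by the database.
func ResetSanitized(db execer) (int64, error) {
	res, err := db.Exec(resetQuery)
	if err != nil {
		return 0, fmt.Errorf("reset sanitized cache: %w", err)
	}
	return res.RowsAffected()
}

// fakeDB records queries instead of talking to a database.
type fakeDB struct {
	queries []string
}

func (f *fakeDB) Exec(query string, args ...any) (sql.Result, error) {
	f.queries = append(f.queries, query)
	// Pretend three rows had cached html.
	return driver.RowsAffected(3), nil
}

func main() {
	db := &fakeDB{}
	n, err := ResetSanitized(db)
	fmt.Println(n, err, db.queries)
}
```

Because the raw column is never modified, the reset loses nothing: the next read of each row simply takes the "html is NULL" path again.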

In other words, I'm doing the work in my application to know what has or has not been sanitized, and I'm doing it lazily.

microcosm-cc / bluemonday

Was Sanitized Flag #59