apostrophecms / sanitize-html

Clean up user-submitted HTML, preserving whitelisted elements and whitelisted attributes on a per-element basis. Built on htmlparser2 for speed and tolerance
MIT License
3.68k stars 349 forks source link

Starting with v1.18.2 invalid style value remove entire style tag #576

Closed mattweberio closed 1 year ago

mattweberio commented 1 year ago

Question or comment

Starting with the v2 invalid style values, remove the entire style tag.

We used to be able to do the following:

<p style="color: {{peachColor}};">Test</p>

But this now results in the following:

<p>Test</p>

In our use case, it's okay to use placeholders because they're replaced before publishing.

Is there a way to allow all values within a tag, specifically the style tag?

Details

Version of Node.js: 14.17.0

Server Operating System: Linux

Additional context: We use this config: allowedAttributes: { '*': ['style', 'id', 'class', 'data-*', 'title'], }

We expected the values inside these tags would not be removed, and we could allow anything.

BoDonkey commented 1 year ago

What version did you upgrade from? Version 1.18.2 only adds some testing. Version 1.18.0 did make some changes to allowedAttributes.

boutell commented 1 year ago

Can you create a PR with a failing unit test demonstrating the problem?

boutell commented 1 year ago

You must use allowedAttributes to allow style first, then you must use allowedStyles to specify what can be done within that attribute.

mattweberio commented 1 year ago

Hi @BoDonkey - We were on 1.15.0, which is, of course, super old. It was a project that did its job for a long time. There may be versions in between without tags, so it could be another version that introduced the change. We went back to 1.15.0 for now.

Hi @boutell - Thanks for the reply. We use the allowedAttributes code in the context above. However, allowedStyles doesn't allow us to accept all styles, as far as I can tell. It's not realistic for us to try and list every possible style one by one, but more to the point, the style's value may be a placeholder. Also, we don't want to validate the "value" at all, only the tags and attributes.

Is it possible to allow the style's value validation to be disabled for those that only want to sanitize tags or use placeholders? Note: Some people also had a use case for this in another thread because they use this library in a browser and it throws and exception and removes their style attributes.

boutell commented 1 year ago

Actually according to the source it looks like "allowedStyles: false" in combination with allowing the style attribute ought to allow you to do whatever you want in the style attribute, are you sure that is not the case?

mattweberio commented 1 year ago

Hi @boutell

Actually according to the source it looks like "allowedStyles: false" in combination with allowing the style attribute ought to allow you to do whatever you want in the style attribute, are you sure that is not the case?

This sounds perfect for us, but in 2.7.2 it doesn't appear to be the case. We've tried both allowedStyles: false and allowedStyles: true.

mattweberio commented 1 year ago

Hi @boutell - If I dig into this and make a PR for it to work as you described, would it be supported? i.e., I don't want to do the work if there's no interest in this behavior.

boutell commented 1 year ago

In fairness to you Matt I think it would be best if we took a good look at it ourselves and thought through whether it would create anything that could reasonable be considered a backwards compatibility break first. I'm also curious why it doesn't already work, as my initial reading of the code suggests it ought to.

On Tue, Oct 11, 2022 at 1:31 PM mattweberio @.***> wrote:

Hi @boutell https://github.com/boutell - If I dig into this and make a PR for it to work as you described, would it be supported? i.e., I don't want to do the work if there's no interest in this behavior.

— Reply to this email directly, view it on GitHub https://github.com/apostrophecms/sanitize-html/issues/576#issuecomment-1275043925, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAH27NXPB3G67ETU2AUST3WCWQA7ANCNFSM6AAAAAARBVICBA . You are receiving this because you were mentioned.Message ID: @.***>

--

THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his

boutell commented 1 year ago

@mattweberio I've done the analysis. Yes, we'd accept a PR for this. Specifically, explicitly setting allowedStyles to false (not just any falsy value) would disable the style verification, and so if style was included in allowedAttributes then any style would be permitted.

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.