Closed dgilman closed 1 year ago
As a workaround, I can confirm that instantiating your own Cleaner instance, passing the LinkifyFilter to that, and whitelisting the relevant AMP tags with that cleaner gets you around the issue.
Cool, I'm glad you found a workaround and appreciate you digging up the regression!
This comment from the PR that introduced the regression looks relevant:
This builds a list of all known HTML tags that html5lib knows about.
I thought about making this an argument, but that requires some new naming and tests and documenting work. I currently think if a user needs to specify tags, then they should be sanitizing and then linkifying.
https://github.com/mozilla/bleach/pull/394#discussion_r221985246
If you have time to work on a PR that:
HTML_TAGS
valueHTML_TAGS
in bleach/html5lib_shim.py
I think that'd work.
We're not going to fix this and there's a valid way to get the desired behavior.
Describe the bug
The fix committed in #394 breaks custom HTML5 elements, notably Google's AMP. I don't think your whitelisting strategy is valid in the face of HTML 5 as custom components are gaining more popularity and are part of the standard.
python and bleach versions (please complete the following information):
Bleach 3 - regression is commit bea164a3a7e6ce2dfb826130e8faa6f447680dc1
To Reproduce
Steps to reproduce the behavior:
Expected behavior
No assertion
Additional context
If you have feedback on how this can be fixed I will do a PR, although not this weekend. If it is possible to just hack around this by shoving our own valid list of HTML elements in somewhere I can do that too, and maybe do a documentation PR?