mozilla / bleach

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes
https://bleach.readthedocs.io/en/latest/

fix stripping block-level tags (#369) #651

Closed by willkg 2 years ago

willkg commented 2 years ago

When Bleach strips a block-level element, it should replace the element with a newline, preserving the whitespace that would exist if a browser were parsing the markup.

Fixes #369.

This supersedes PR #642.
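
For illustration, here is a minimal standalone sketch of the behavior described above. It is not Bleach's implementation: it uses the standard library's `html.parser` rather than Bleach's parser, and `BLOCK_LEVEL_TAGS`, `TagStripper`, and `strip_tags` are hypothetical names invented for this example. The tag set is a small illustrative subset, not the contents of `HTML_TAGS_BLOCK_LEVEL`.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical, deliberately small set of block-level tags (assumption:
# this is NOT the actual contents of Bleach's HTML_TAGS_BLOCK_LEVEL).
BLOCK_LEVEL_TAGS = frozenset({"blockquote", "div", "li", "ol", "p", "pre", "ul"})


class TagStripper(HTMLParser):
    """Strip every tag, emitting a newline in place of block-level ones."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self._strip(tag)

    def handle_endtag(self, tag):
        self._strip(tag)

    def _strip(self, tag):
        # Inline tags vanish silently; block-level tags leave a newline so
        # adjacent blocks of text do not run together.
        if tag in BLOCK_LEVEL_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        # Re-escape text so the output is still safe to embed in HTML.
        self.parts.append(escape(data, quote=False))


def strip_tags(html):
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


print(repr(strip_tags("<p>one</p><p>two</p>")))
print(repr(strip_tags("a <b>bold</b> word")))
```

Without the newline, the first input would collapse to `onetwo`, which is the problem this PR addresses; inline tags such as `<b>` are still removed without adding whitespace.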

willkg commented 2 years ago

@g-k Can you eyeball this? I'm pretty sure it's fine.

Alex3917 commented 2 years ago

Does it make sense to make this optional? (Other than via monkey patching HTML_TAGS_BLOCK_LEVEL.)

willkg commented 2 years ago

> Does it make sense to make this optional? (Other than via monkey patching HTML_TAGS_BLOCK_LEVEL.)

I'm going to evaluate "does it make sense" in terms of "users have expressed a need" and so far I haven't seen anyone express a need to make this optional. In my uses of Bleach, having the additional newlines would have been fine. Given that, I think the answer at this time should be no. If someone has a need, they can write up an issue and we can evaluate things from there.