mozilla / bleach

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes
https://bleach.readthedocs.io/en/latest/

RFE: move away from deprecated `html5lib` #729

Closed kloczek closed 7 months ago

kloczek commented 7 months ago

Is your feature request related to a problem? Please describe. It would be nice to cut the tail of some legacy module dependencies. One of those modules is html5lib.

Describe the solution you'd like It would be good to remove the use of the deprecated html5lib, as was done in pip ~2 years ago. https://github.com/pypa/pip/pull/11259

Describe alternatives you've considered A clear and concise description of any alternative solutions or features you've considered.

Additional context html5lib depends on six, which has been on the list of deprecated modules even longer. Implementing this RFE would make it easier to kill two birds with one stone 😋

willkg commented 7 months ago

Bleach is deprecated and in a maintenance-only mode. See #698. Given that, I'm going to close this out.

Having said that, pip and Bleach used html5lib in radically different ways. I don't believe it's possible to switch Bleach over to use html.parser. A change like that would probably involve a fundamental rewrite of Bleach. If someone was interested in pursuing that, they should do it as a new library.
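
For readers unfamiliar with the difference, here is a minimal sketch of the kind of job the standard library's `html.parser` handles comfortably: pulling link targets out of a page as a stream of tag events. It is not code from pip or Bleach; the `LinkCollector` class and `collect_links` helper are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect href values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names and passes attrs as (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def collect_links(html):
    """Feed a document through the event-based parser and return the hrefs seen."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<p><a href="pkg-1.0.tar.gz">pkg 1.0</a> <a href="pkg-1.1.tar.gz">pkg 1.1</a>'
    print(collect_links(page))
```

Link extraction only needs to read attributes off start tags. A sanitizer, by contrast, has to decide how malformed markup nests, filter it, and serialize the result back to HTML. `html.parser` only emits events and does not build a tree, which is presumably part of why a switch would amount to a rewrite.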