mozilla / bleach

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes
https://bleach.readthedocs.io/en/latest/

bleach breaks because of html5lib #212

Closed matthiaskubik closed 8 years ago

matthiaskubik commented 8 years ago

As of a few hours ago, bleach is broken because of a new version of html5lib:

  File "//endpoint/rest_endpoint_handler.py", line 21, in <module>
    import bleach
  File "/usr/local/lib/python2.7/site-packages/bleach-1.4.2-py2.7.egg/bleach/__init__.py", line 8, in <module>
    from html5lib.sanitizer import HTMLSanitizer
ImportError: No module named sanitizer

Fix would be documented here https://github.com/html5lib/html5lib-python/issues/277

sergio97 commented 8 years ago

Looks like html5lib moved or removed 'sanitizer'. It seems this was intentional.

generalov commented 8 years ago

0.99999999/1.0b9 Released on July 14, 2016

Get rid of the sanitizer package. Merge sanitizer.sanitize into the sanitizer.htmlsanitizer module and move that to sanitizer. This means anyone who used sanitizer.sanitize or sanitizer.HTMLSanitizer needs no code changes.

Unfortunately, bleach is using from html5lib.sanitizer import HTMLSanitizer.
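A quick way to check whether a given environment is affected is to test whether the module that bleach imports can be loaded at all. The following is a minimal sketch, not part of bleach or html5lib; the helper name `module_available` is hypothetical and introduced only for illustration.

```python
import importlib


def module_available(dotted_name):
    """Return True if `dotted_name` can be imported, False otherwise.

    Hypothetical helper for illustration; not part of bleach or html5lib.
    """
    try:
        importlib.import_module(dotted_name)
    except ImportError:
        # Covers both a missing top-level package and a missing submodule.
        return False
    return True


if __name__ == "__main__":
    # The bleach versions in this thread import html5lib.sanitizer at load time,
    # so if this prints False, `import bleach` fails with the ImportError above.
    print("html5lib.sanitizer importable:", module_available("html5lib.sanitizer"))
```

If the check reports False while html5lib itself is installed, the installed html5lib is most likely a release in which that module is no longer available.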

jonathanmorgan commented 8 years ago

Any idea of time frame on figuring out a fix? If it is a big change and will take time, I understand, just want to get an idea.

matthiaskubik commented 8 years ago

I already opened a pull request to fix the requirements.txt file, but there has been no response or action yet. In the end, the requirements.txt has the wrong version: it has the one with 8 nines, where it should have the one with one 7. All we need is someone to accept the pull request as a quick workaround.

willkg commented 8 years ago

I can't reproduce this issue. Here's my attempt:

$ mkvirtualenv bleachtest
<testing> $ pip install bleach
Collecting bleach
  Downloading bleach-1.4.3-py2-none-any.whl
Collecting six (from bleach)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting html5lib<0.99999999,>=0.999 (from bleach)
Installing collected packages: six, html5lib, bleach
Successfully installed bleach-1.4.3 html5lib-0.9999999 six-1.10.0
<testing> $ python
Python 2.7.12 (default, Jul  1 2016, 15:12:24) 
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import bleach
>>> from html5lib.sanitizer import HTMLSanitizer
>>> 

That seems to work fine to me.

Am I doing something different in my steps to reproduce than you are?

matthiaskubik commented 8 years ago

@willkg I double-checked. You're right, my bad. I figured out that the later version of html5lib was pulled in by Flask, which was installed prior to bleach. I'll cancel my pull request.
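When a previously installed package has already pulled in a dependency, it helps to confirm which version is actually present before blaming the library. The sketch below assumes Python 3.8 or later, where `importlib.metadata` is in the standard library (the Python 2.7 environment in this thread does not have it); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; assumes Python 3.8+.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("bleach", "html5lib"):
        print(name, installed_version(name))
```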

jonathanmorgan commented 8 years ago

Is there a separate ticket for updates to make bleach work with the new html5lib changes?

jonathanmorgan commented 8 years ago

Also, I just tried it, and on my machine, bleach doesn't work with 0.99999999 (eight 9s). The test you ran above looks like it installed bleach and html5lib-0.9999999 (seven 9s).

Successfully installed bleach-1.4.3 html5lib-0.9999999 six-1.10.0

If I misunderstand and the max version right now is seven 9s, please disregard.
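The seven-versus-eight-nines distinction is easy to misread, so here is a minimal sketch of the range check implied by the constraint in the pip output above (`html5lib<0.99999999,>=0.999`). It assumes purely numeric dotted versions and does not handle pre-release strings such as `1.0b9`; the function names are hypothetical.

```python
def parse_version(text):
    """Split a purely numeric dotted version such as '0.9999999' into ints."""
    return tuple(int(part) for part in text.split("."))


# Bounds copied from the pip output in this thread: html5lib<0.99999999,>=0.999
LOWER = parse_version("0.999")
UPPER = parse_version("0.99999999")


def in_supported_range(version):
    """Return True if `version` satisfies >=0.999,<0.99999999."""
    return LOWER <= parse_version(version) < UPPER


if __name__ == "__main__":
    for candidate in ("0.9999999", "0.99999999", "0.999999999"):
        print(candidate, in_supported_range(candidate))
```

Under this check, seven 9s falls inside the range, while eight and nine 9s fall outside it, which matches the behavior reported in the thread.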

sirivellamadhu commented 7 years ago

Having the same issue when using Cartridge with Mezzanine CMS:

python manage.py createdb --noinput
/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/Mezzanine-4.2.2-py3.5.egg/mezzanine/utils/conf.py:61: UserWarning: You haven't defined the ALLOWED_HOSTS settings, which Django requires. Will fall back to the domains configured as sites.
  warn("You haven't defined the ALLOWED_HOSTS settings, which "
Traceback (most recent call last):
  File "manage.py", line 14, in <module>
    execute_from_command_line(sys.argv)
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/django/core/management/__init__.py", line 367, in execute_from_command_line
    utility.execute()
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/django/core/management/__init__.py", line 341, in execute
    django.setup()
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/django/__init__.py", line 27, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/django/apps/registry.py", line 108, in populate
    app_config.import_models(all_models)
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/django/apps/config.py", line 199, in import_models
    self.models_module = import_module(models_module_name)
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 958, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 673, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 665, in exec_module
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/Mezzanine-4.2.2-py3.5.egg/mezzanine/conf/models.py", line 7, in <module>
    from mezzanine.core.models import SiteRelated
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/Mezzanine-4.2.2-py3.5.egg/mezzanine/core/models.py", line 24, in <module>
    from mezzanine.core.fields import RichTextField, OrderField
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/Mezzanine-4.2.2-py3.5.egg/mezzanine/core/fields.py", line 14, in <module>
    from mezzanine.utils.html import escape
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/Mezzanine-4.2.2-py3.5.egg/mezzanine/utils/html.py", line 18, in <module>
    from bleach import clean, sanitizer
  File "/home/sys2/anaconda3/envs/web/lib/python3.5/site-packages/bleach-1.5.0-py3.5.egg/bleach/__init__.py", line 14, in <module>
    from html5lib.sanitizer import HTMLSanitizer
ImportError: No module named 'html5lib.sanitizer'

danchay commented 7 years ago

Ditto here. Same as sirivellamadhu.

bnisevic commented 7 years ago

Doesn't work with nine 9s.

kushaldas commented 7 years ago

from html5lib.sanitizer import HTMLSanitizerMixin

This will be another issue.

willkg commented 7 years ago

@kushaldas and everyone else: this issue is closed. If you want to keep commenting on how bleach doesn't work with html5lib > 8 9s, please comment here: https://github.com/mozilla/bleach/issues/229