HainLuud closed this 3 weeks ago
Maybe you could share a minimal template that triggers this error? I am wondering why we did not notice this before.
It's an auto-generated fuzzing input, so there is not much logic behind it, but an example input would be 8zwhWz4= in base64:
import base64
from genshi import HTML
from genshi.filters import HTMLSanitizer
inp = base64.b64decode("8zwhWz4=")
markup = HTML(inp) | HTMLSanitizer()
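For reference, the payload is only five bytes long. The sketch below uses just the standard library to show what it decodes to. Reading the `<![` sequence as the part that sends the parser down its declaration-handling path is an assumption based on the decoded bytes, not something stated in the report.

```python
import base64

# Decode the fuzzer-generated payload from the report above.
payload = base64.b64decode("8zwhWz4=")

# One non-ASCII byte followed by "<![>", which looks like an
# unterminated marked-section opener (assumed to be the trigger).
print(payload)       # b'\xf3<![>'
print(len(payload))  # 5
```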
@HainLuud Thank you for reporting this. I've added a simple fix and a slightly simpler version of the test you wrote.
@FelixSchwarz I guess this never came up because triggering an exception here is quite hard. The Python HTMLParser (when not in strict mode) accepts almost anything as valid HTML as long as it's valid text.
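To illustrate how lenient the standard-library parser is, here is a minimal sketch that feeds it mismatched, unclosed markup and records the events it emits. `EventCollector` and the sample string are hypothetical names made up for this example and are not part of genshi. The final check assumes a Python version in which `HTMLParseError` is no longer present in `html.parser`.

```python
import html.parser
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records parser events instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


collector = EventCollector()
# Mismatched and unclosed tags are accepted without raising.
collector.feed("<p>unclosed <b>bold</i> text")
collector.close()
print(collector.events)

# On current Python versions the old exception class is gone, so code that
# names it in an "except" clause fails with AttributeError instead.
print(hasattr(html.parser, "HTMLParseError"))
```

Because the parser tolerates input like this, the exception handler in genshi is only reached for unusual inputs, which fits the observation that the problem went unnoticed for so long.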
At line 349 of input.py, the exception handler tries to access html.HTMLParseError, an error class that used to exist in CPython's html library but was removed in Python 3.5. The genshi code in question is this: