GoogleCloudPlatform / webapp2

webapp2 is a framework for Google App Engine
https://webapp2.readthedocs.org

Handle UnicodeDecodeError on bad UTF-8 URLs #152

Open snarfed opened 4 years ago

snarfed commented 4 years ago

If you request a URL whose percent-escaped bytes are not valid UTF-8, e.g. /%D0%C2%BD%A8%CE%C4%BC%FE%BC%D0.rar, webapp2 crashes with a stack trace like this:

Traceback (most recent call last):
  File "/env/lib/python3.7/site-packages/webapp2.py", line 1573, in __call__
    rv = self.handle_exception(request, response, e)
  File "/env/lib/python3.7/site-packages/webapp2.py", line 1567, in __call__
    rv = self.router.dispatch(request, response)
  File "/env/lib/python3.7/site-packages/webapp2.py", line 1301, in default_dispatcher
    route, args, kwargs = rv = self.match(request)
  File "/env/lib/python3.7/site-packages/webapp2.py", line 1241, in default_matcher
    match = route.match(request)
  File "/env/lib/python3.7/site-packages/webapp2.py", line 884, in match
    match = self.regex.match(unquote(request.path))
  File "/env/lib/python3.7/site-packages/webob/request.py", line 476, in path
    bpath = bytes_(self.path_info, self.url_encoding)
  File "/env/lib/python3.7/site-packages/webob/descriptors.py", line 70, in fget
    return req.encget(key, encattr=encattr)
  File "/env/lib/python3.7/site-packages/webob/request.py", line 165, in encget
    return bytes_(val, 'latin-1').decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 1: invalid continuation byte
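
To see why this path trips the decoder, here is a minimal standalone sketch that uses only the standard library, with no webapp2 or WebOb involved. The helper name `is_valid_utf8_path` is made up for illustration. It percent-decodes the path to raw bytes and then tries to decode those bytes as UTF-8, which is roughly the step that fails in the traceback above.

```python
from urllib.parse import unquote_to_bytes


def is_valid_utf8_path(raw_path: str) -> bool:
    """Return True if the percent-decoded path is valid UTF-8.

    Hypothetical helper for illustration; not part of webapp2 or WebOb.
    """
    try:
        # unquote_to_bytes turns "%D0%C2" into b"\xd0\xc2" without decoding.
        unquote_to_bytes(raw_path).decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


bad = "/%D0%C2%BD%A8%CE%C4%BC%FE%BC%D0.rar"
good = "/caf%C3%A9.rar"
print(is_valid_utf8_path(bad), is_valid_utf8_path(good))
```

For the URL in this report, the byte 0xd0 at position 1 starts a two-byte UTF-8 sequence, but the next byte, 0xc2, is not a continuation byte, which matches the error message in the traceback.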

Evidently the exception itself is raised by WebOb, by design; details are in https://github.com/Pylons/webob/issues/114. webapp2 should probably at least return HTTP 400 for bad URLs like these, instead of letting WebOb's UnicodeDecodeError propagate all the way up and result in a 500.
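
Until something like that lands in webapp2 itself, one possible workaround is a small piece of WSGI middleware that rejects such paths before the router ever sees them. This is only a sketch under assumptions, not how webapp2 or WebOb handle it: the class name `RejectBadUtf8Path` is made up, it checks only `PATH_INFO` (not the query string), and it relies on the PEP 3333 convention that `PATH_INFO` is a native string holding latin-1-decoded bytes, which is the same round trip shown in the last frame of the traceback.

```python
class RejectBadUtf8Path:
    """WSGI middleware (hypothetical) that answers 400 for non-UTF-8 paths."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        try:
            # Recover the raw bytes (PEP 3333 stores them as latin-1 text),
            # then check that they decode as UTF-8.
            path.encode("latin-1").decode("utf-8")
        except UnicodeError:
            body = b"400 Bad Request: URL path is not valid UTF-8\n"
            start_response(
                "400 Bad Request",
                [
                    ("Content-Type", "text/plain; charset=utf-8"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Path is fine: hand the request to the wrapped application.
        return self.app(environ, start_response)
```

Assuming the application object is an ordinary `webapp2.WSGIApplication`, it could be wrapped as `app = RejectBadUtf8Path(app)`. Requests with well-formed paths pass through untouched.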

thanks for maintaining webapp2 btw, it's great!

snarfed commented 4 years ago

See also https://github.com/Pylons/webob/issues/161