falconry / falcon

The no-magic web data plane API and microservices framework for Python developers, with a focus on reliability, correctness, and performance at scale.
https://falcon.readthedocs.io/en/stable/
Apache License 2.0

Special characters are wrongly encoded/decoded #2173

Closed vpas88 closed 1 year ago

vpas88 commented 1 year ago

I have a small application using falcon-api whose URIs can contain special characters like:

íéáűúőóüö

When I send the above URI, it is percent-encoded to:

%C3%AD%C3%A9%C3%A1%C5%B1%C3%BA%C5%91%C3%B3%C3%BC%C3%B6

I use gunicorn, and gunicorn passes this to Falcon as:

íéáÃ

After that, it's encoded and decoded in request.py here (I guess):

if not isascii(path):
    path = path.encode('iso-8859-1').decode('utf-8', 'replace')

The result is:

±ÃºÃ

Instead of the original:

íéáűúőóüö

vytas7 commented 1 year ago

Hi @vpas88! I've converted this issue to a discussion since I cannot really reproduce any of the above. In the referenced line, Falcon is merely following the PEP 3333 convention of decoding tunneled bytes from Latin-1.
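For readers following along, the Latin-1 tunneling described above can be sketched in plain Python. This is only an illustration under stated assumptions: `wsgi_tunnel` and `recover_path` are hypothetical helper names, not part of Falcon or gunicorn, and `str.isascii()` stands in for the `isascii` helper from the quoted snippet.

```python
from urllib.parse import unquote_to_bytes

ORIGINAL = 'íéáűúőóüö'
PERCENT_ENCODED = '%C3%AD%C3%A9%C3%A1%C5%B1%C3%BA%C5%91%C3%B3%C3%BC%C3%B6'


def wsgi_tunnel(percent_encoded: str) -> str:
    """Hypothetical stand-in for what a PEP 3333 server does with the path.

    The percent-escapes are decoded to raw bytes, and those bytes are then
    exposed as a native str by decoding them as Latin-1 (one char per byte).
    """
    raw = unquote_to_bytes(percent_encoded)
    return raw.decode('iso-8859-1')


def recover_path(path: str) -> str:
    """Hypothetical stand-in mirroring the quoted request.py lines.

    Re-encoding as Latin-1 restores the original bytes, which are then
    decoded as UTF-8.
    """
    if not path.isascii():
        path = path.encode('iso-8859-1').decode('utf-8', 'replace')
    return path


tunneled = wsgi_tunnel(PERCENT_ENCODED)  # mojibake-looking str, one char per byte
recovered = recover_path(tunneled)       # the original text, if tunneled exactly once
```

Under these assumptions, the round trip only returns the original text when the server has applied the Latin-1 decoding exactly once; a path that has already been decoded or re-encoded somewhere along the way would not survive the second step intact.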