falconry / falcon

The no-magic web data plane API and microservices framework for Python developers, with a focus on reliability, correctness, and performance at scale.
https://falcon.readthedocs.io/en/stable/
Apache License 2.0

Better handling of invalid WSGI path encoding #2340

Open vytas7 opened 1 month ago

vytas7 commented 1 month ago

Split from #1685: as reported by @rokclimb15, in some cases the WSGI path may contain Unicode characters that don't comply with the WSGI spec. According to PEP 3333:

On Python platforms where the str or StringType type is in fact Unicode-based (e.g. Jython, IronPython, Python 3, etc.), all “strings” referred to in this specification must contain only code points representable in ISO-8859-1 encoding (\u0000 through \u00FF, inclusive). It is a fatal error for an application to supply strings containing any other Unicode character or code point. Similarly, servers and gateways must not supply strings to an application containing any other Unicode characters.

Per this definition, Falcon actually handles this correctly by exploding with an unhandled error; that is what a fatal error is.
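For illustration, here is a minimal sketch of the failure mode, assuming the usual WSGI idiom of re-encoding `PATH_INFO` as ISO-8859-1 to recover the raw bytes before decoding them as UTF-8. The helper name `decode_wsgi_path` is hypothetical and is not taken from Falcon's code base.

```python
def decode_wsgi_path(path_info: str) -> str:
    """Hypothetical helper: recover the real path from a PEP 3333 'native string'."""
    # Under PEP 3333 each code point stands for exactly one raw byte,
    # so encoding as ISO-8859-1 gives back the bytes sent by the client.
    raw = path_info.encode('iso-8859-1')
    return raw.decode('utf-8', 'replace')


# Spec-compliant input: the UTF-8 bytes of 'é' (0xC3 0xA9) presented as two
# code points in the U+0000..U+00FF range.
assert decode_wsgi_path('/caf\u00c3\u00a9') == '/café'

# Non-compliant input: U+0119 lies outside ISO-8859-1, so the encode step
# itself fails before any UTF-8 decoding happens.
try:
    decode_wsgi_path('/caf\u0119')
except UnicodeEncodeError as ex:
    error_message = str(ex)
```

The resulting `UnicodeEncodeError` mentions only the codec and the character position, which is why it can look like a framework bug rather than a server-side spec violation.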

However, if it is not too expensive to catch this error, maybe we could render an HTTP 400 response anyway, with a helpful message explaining what the actual problem was? Or alternatively, bubble up an unhandled error, but with a more helpful message explaining what exactly is going on (with a reference to the spec).
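As a rough sketch of the first option, the check could look something like the plain WSGI middleware below. This is only an assumption about one possible shape, not a proposed Falcon implementation; the name `reject_invalid_path` and the message text are made up for the example.

```python
def reject_invalid_path(app):
    """Hypothetical WSGI middleware: answer 400 if PATH_INFO violates PEP 3333."""

    def middleware(environ, start_response):
        path = environ.get('PATH_INFO', '')
        try:
            # Only code points U+0000..U+00FF are allowed by PEP 3333.
            path.encode('iso-8859-1')
        except UnicodeEncodeError:
            body = (
                b'Invalid request path: the WSGI server supplied PATH_INFO '
                b'with code points outside ISO-8859-1 (see PEP 3333).'
            )
            headers = [
                ('Content-Type', 'text/plain; charset=utf-8'),
                ('Content-Length', str(len(body))),
            ]
            start_response('400 Bad Request', headers)
            return [body]
        # Compliant path: hand the request over to the wrapped application.
        return app(environ, start_response)

    return middleware
```

The cost on the happy path is one extra `encode()` call per request, which is the overhead the "not too expensive" caveat refers to.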

CaselIT commented 1 month ago

Shouldn't this rather be a 500 error, since something in the "backend" is not working as expected, given that we are receiving a value outside of the backend spec?

vytas7 commented 1 month ago

Yeah, or maybe just an instance of RuntimeError that has an easier-to-understand message explaining what the problem is. Because now people might think it is a bug in Falcon.
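A minimal sketch of that idea, assuming the path is decoded via an ISO-8859-1 round trip; the function name `decode_path_or_explain` and the wording of the message are hypothetical:

```python
def decode_path_or_explain(path_info: str) -> str:
    """Hypothetical helper: decode PATH_INFO, or fail with a descriptive error."""
    try:
        raw = path_info.encode('iso-8859-1')
    except UnicodeEncodeError as ex:
        # Chain the original exception so the low-level details stay available.
        raise RuntimeError(
            'The WSGI server supplied a PATH_INFO containing code points '
            'outside ISO-8859-1 (U+0000 through U+00FF), which violates '
            'PEP 3333. This most likely indicates a problem in the WSGI '
            'server or gateway, not in the application framework.'
        ) from ex
    return raw.decode('utf-8', 'replace')
```

Because the error stays unhandled, a typical WSGI server would still answer with a 500 response, but the traceback now points to the spec violation.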

CaselIT commented 1 month ago

runtime error works too