Open d3804163-a08b-44bd-b366-736a895f9371 opened 8 years ago
The WSGI reference implementation does not provide any means for application code to distinguish between the following request lines:
GET /foo/bar HTTP/1.1
GET /foo%2Fbar HTTP/1.1
Now, the relevant RFC 1945 (https://tools.ietf.org/html/rfc1945#section-3.2) does not explicitly state how application code should handle these, but its BNF clearly distinguishes encoded from unencoded forward slashes. This suggests that a percent-encoded slash should be considered part of a path segment, while an unencoded slash should be considered a segment separator, and thus that the first URL is supposed to be interpreted as ['foo', 'bar'], but the second one as ['foo/bar']. However, the 'PATH_INFO' WSGI environ variable contains the same string, '/foo/bar', in both cases, making it impossible for application code to handle the difference.
I believe the underlying issue is that percent-decoding (and decoding the URL bytes into a unicode string) happens before 'PATH_INFO' is interpreted. That is unavoidable given the design decision to present PATH_INFO as a unicode string: if it were kept as a bytestring, then interpreting it would remain the sole responsibility of the application code; if it were a fully parsed list of unicode path segments, then the splitting could be implemented correctly.
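The collapse described above can be reproduced without starting a server. The following is a minimal sketch, assuming that wsgiref's simple_server derives PATH_INFO by percent-decoding the whole request path in one step; `path_info_like_wsgiref` and `split_then_decode` are hypothetical helper names used only for illustration and are not part of wsgiref.

```python
from urllib.parse import unquote


def path_info_like_wsgiref(raw_path):
    # Assumption: approximates how wsgiref.simple_server fills PATH_INFO,
    # i.e. the entire path is percent-decoded before the application sees it.
    return unquote(raw_path, "iso-8859-1")


def split_then_decode(raw_path):
    # Hypothetical alternative: split on literal slashes first, then
    # percent-decode each segment, so an encoded %2F stays inside its segment.
    return [unquote(segment) for segment in raw_path.lstrip("/").split("/")]


for raw in ("/foo/bar", "/foo%2Fbar"):
    print(raw, "->", repr(path_info_like_wsgiref(raw)), "|", split_then_decode(raw))
```

Under that assumption, both request paths yield the same decoded string, while splitting before decoding keeps them distinct. An application can only do the latter if it has access to the raw, still-encoded path, which is exactly what PATH_INFO does not carry.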
Unfortunately, I cannot see a pleasant way of fixing this without breaking a whole lot of stuff, but maybe someone else does.
It's also quite possible that I am interpreting the RFC incorrectly; if so, please enlighten me.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['type-bug']
title = 'wsgiref simple_server PATH_INFO treats slashes and %2F the same'
updated_at =
user = 'https://bugs.python.org/tdammers'
```
bugs.python.org fields:
```python
activity =
actor = 'ned.deily'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation =
creator = 'tdammers'
dependencies = []
files = []
hgrepos = []
issue_num = 28355
keywords = []
message_count = 1.0
messages = ['278032']
nosy_count = 2.0
nosy_names = ['pje', 'tdammers']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue28355'
versions = ['Python 3.4']
```