Open 3960998a-15e7-45ca-91ff-62b29fe7aaba opened 3 years ago
It is possible to get valid ParseResult from the urlparse function even for a non-numeric port value. Only by requesting the port it fails[1]. Would it be an improvement if _checknetloc[2] validates the value of port properly?
// code snippet
Python 3.8.5 (default, Jan 27 2021, 15:41:15)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import urlparse
>>> uri = 'xx://foo:bar'
>>> uri_parts = urlparse(uri)
>>> uri_parts.netloc
'foo:bar'
>>> uri_parts.port
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/urllib/parse.py", line 174, in port
raise ValueError(message) from None
ValueError: Port could not be cast to integer value as 'bar'
// code snippet
[1] https://github.com/python/cpython/blob/e1903e11a3d42512effe336026e0c67f602e5848/Lib/urllib/parse.py#L172 [2] https://github.com/python/cpython/blob/e1903e11a3d42512effe336026e0c67f602e5848/Lib/urllib/parse.py#L416
I guess moving port validation logic to parsing time is done as part of https://github.com/python/cpython/pull/16780
Treating this as bug in itself might be a better idea than waiting for a ipv6 scope introduction, which had few caveats.
Would it be an improvement if _checknetloc[2] validates the value of port properly?
Yes, we could check if it is an int. That should be sufficient.
Thank you for your swift response and your willingness to add port validation to _checknetloc.
I think the validation itself should compound both exceptional branches implemented in port[3]
I also think the validation logic should be ran as early as possible.
I gave it a shot and implemented it.
I appreciate any reviews: https://github.com/python/cpython/pull/25774
Got some ideas from https://github.com/python/cpython/pull/16780
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields: ```python assignee = 'https://github.com/orsenthil' closed_at = None created_at =
labels = ['3.10']
title = "urllib.parse.urlparse doesn't check port"
updated_at =
user = 'https://github.com/p-alik'
```
bugs.python.org fields:
```python
activity =
actor = 'miguendes'
assignee = 'orsenthil'
closed = False
closed_date = None
closer = None
components = []
creation =
creator = 'palik'
dependencies = []
files = []
hgrepos = []
issue_num = 43871
keywords = ['patch']
message_count = 5.0
messages = ['391238', '391242', '391265', '391282', '392577']
nosy_count = 5.0
nosy_names = ['orsenthil', 'vstinner', 'xtreak', 'palik', 'miguendes']
pr_nums = ['25774']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue43871'
versions = ['Python 3.10']
```