Closed MaxDall closed 4 months ago
I'm sorry, I just realized that this is a copy of #346 thus I want to expand my question:
As #346 suggested to pass strict_query=False
, basically disabling query parsing, are there plans to expand the existing implementation to avoid this and validate queries correctly again?
As https://github.com/python-validators/validators/issues/346 suggested to pass strict_query=False, basically disabling query parsing, ...
How so?
The used urllib.parse.parse_qs
function utilizes a boolean parameter called keep_blank_values
, as outlined in the documentation, to determine whether to retain empty query keys. One could propagate said parameter to users, or enable it by default.
Following my initial example (https://www.lrt.lt/?rss
)
urllib.parse.parse_qs("rss", keep_blank_values=True)
#output
{'rss': ['']}
If this suits you, I could open a PR.
Update:
Additionally, I would suggest replacing strict_query
with keep_blank_values
, as the former suppresses part of the validation, which feels kinda contradictory to the library's purpose, but that's up to you :)
Update II:
Just realized that setting keep_blank_values=True
still treats it as a parsing error thus necessitating strict_parsing=False
to work. I'm gonna reach out to urllib
to see if I can do something there.
There is already an issue open in cpython
referencing this problem.
Just realized that setting keep_blank_values=True still treats it as a parsing error thus necessitating strict_parsing=False to work. I'm gonna reach out to urllib to see if I can do something there.
URL, already has 9 keyword and 1 positional parameter, I'm afraid adding more would clutter the function definition.
Closing this for now. Will reopen, if https://github.com/python/cpython/issues/110843 is resolved.
Hey
validators
team :)Thanks for the great work you do in maintaining this package 👍.
I've noticed a significant change regarding URL validation in the recent updates, particularly with 5f6437c, where the parsing of query strings underwent a complete overhaul. Specifically, the package now utilizes urllib's query parsing method, resulting in URLs containing query strings with keys but without values (e.g., https://www.lrt.lt/?rss) being flagged as invalid. Given that URLs of this nature are commonly encountered, and according to RFC-3986, the presence of values for keys is not required, I'm curious to know if this change was intentional or merely a side effect.