python-validators / validators

Python Data Validation for Humans™.
MIT License

URL is validated as true if it contains brackets or apostrophe signs #338

Closed: mquartus closed this issue 6 months ago

mquartus commented 6 months ago

Hello,

I am not sure whether this is the desired behaviour, so I wanted to check with you. While fixing a genuine cross-site scripting vulnerability in our web application, we found that the following malicious input, which contains an apostrophe (') and round brackets, had been injected:

https://example.org?q=search');alert(document.domain);

But when I pass it to the validators.url() function, it is accepted and the call returns True:

>>> validators.url("https://example.org?q=search');alert(document.domain);")
True

I would not expect this: the desired result of url() in this case would be False. Am I overlooking something, or is this input meant to be accepted? Let me know if you need further information. I am using validators 0.23.2 on Python 3.9.6.

Thank you, Miklos

yozachar commented 6 months ago

Hey, thanks for bringing this up. Internally validators.url() uses Python's urllib.parse.parse_qs function. The behavior of parse_qs changed in Python 3.9.2. See for yourself:

Python 3.9.2 (default, Mar 21 2024, 06:39:21) 
[GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import parse_qs
>>> parse_qs("q=search');alert(document.domain);", strict_parsing=True)
{'q': ["search');alert(document.domain);"]}
Python 3.9.1 (default, Mar 21 2024, 06:40:47)
[GCC 13.2.1 20230801] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from urllib.parse import parse_qs
>>> parse_qs("q=search');alert(document.domain);", strict_parsing=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/us-er/.local/share/mise/installs/python/3.9.1/lib/python3.9/urllib/parse.py", line 692, in parse_qs
    pairs = parse_qsl(qs, keep_blank_values, strict_parsing,
  File "/home/us-er/.local/share/mise/installs/python/3.9.1/lib/python3.9/urllib/parse.py", line 747, in parse_qsl
    raise ValueError("bad query field: %r" % (name_value,))
ValueError: bad query field: 'alert(document.domain)'

If you can tell me why, I may be able to resolve it.
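A likely explanation, going by the Python changelog: starting with 3.9.2, parse_qs and parse_qsl split fields on "&" only by default, and a separator parameter was added for callers who want a different character. Before that, ";" also acted as a field separator, so the text after the first ";" became a field without "=", which strict parsing rejects. The sketch below reproduces both outcomes on a single interpreter (3.9.2 or later). The helper name parse_strict is made up for illustration and is not part of validators, and passing separator=";" only approximates the old behaviour, which split on both "&" and ";".

```python
from urllib.parse import parse_qs

QUERY = "q=search');alert(document.domain);"


def parse_strict(query: str, separator: str = "&") -> dict:
    """Strictly parse a query string using an explicit field separator.

    Hypothetical helper for illustration; requires Python 3.9.2+,
    where the `separator` parameter exists.
    """
    return parse_qs(query, strict_parsing=True, separator=separator)


# Default separator "&": the whole text after "q=" is a single value.
new_style = parse_strict(QUERY)

# Separator ";": the query splits into "q=search')", "alert(document.domain)"
# and an empty field. The second field has no "=", so strict parsing raises.
try:
    parse_strict(QUERY, separator=";")
    old_style_error = None
except ValueError as exc:
    old_style_error = str(exc)

print(new_style)
print(old_style_error)
```

If this is indeed the cause, the input is accepted because ";" is now ordinary data inside the value of q rather than a delimiter.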

mquartus commented 6 months ago

Ah, OK, thanks. Let's leave it to upstream then; if this reaches them, they will probably fix it.
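For readers who land here with the same problem: apostrophes, parentheses and semicolons are permitted in URLs by RFC 3986 (they are "sub-delims"), so a syntactically valid URL is not necessarily safe to embed in a page. Below is a minimal sketch of an extra application-side check, using only the standard library. The function names and the character blocklist are assumptions chosen for illustration; they are not part of validators, and a blocklist is not a complete XSS defence. Encoding output for the context it is written into remains the usual fix.

```python
from urllib.parse import quote, urlsplit

# Hypothetical blocklist: characters often used to break out of a
# JavaScript or HTML context. Adjust to your application's needs.
SUSPICIOUS_CHARS = set("'\"()<>;")


def query_has_suspicious_chars(url: str) -> bool:
    """Return True if the query component contains a blocklisted character."""
    return any(ch in SUSPICIOUS_CHARS for ch in urlsplit(url).query)


def encode_query_value(value: str) -> str:
    """Percent-encode everything except unreserved characters."""
    return quote(value, safe="")


url = "https://example.org?q=search');alert(document.domain);"
flagged = query_has_suspicious_chars(url)
encoded = encode_query_value("search');alert(document.domain);")

print(flagged)
print(encoded)
```

A check like this can run alongside validators.url(): the library answers "is this a URL?", while the application decides which characters it is willing to accept or how to encode them.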