Closed haavardw closed 5 months ago
Hi @haavardw, can you please provide a couple of use cases to consider?
We have a system where you can submit a URL and get the web-site analyzed and scanned for phishing, malware etc. The argument public=True
is used to prevent scanning of local resources, e.g. file:///etc/passwd or local cloud resources, which would open up for an attack on our servers. We are currently using 0.18.2 but would like to upgrade.
This article from Snyk is also apparently SEO optimized to appear for search results regarding python and URL validation: https://snyk.io/blog/secure-python-url-validation/
The code presented is essentially flawed as validators
believes the local URL is valid, and only fails due to an invalid kwarg, public
.
validation = validators.url("https://10.0.0.1", public=True)
if validation:
print("URL is valid")
else:
print("URL is invalid")
I read the source, and a comment in the source code suggests that public
is a parameter, though the arg does not appear in the code.
I came to GitHub to search for a commit or perhaps issues that describe when/why public
was removed.
Just mentioning this in case others find themselves here for the same reasons.
@haavardw you may be able to limit protocols or first check the url uses http
| https
explicitly before validation. Additionally it appears like file:///etc/passwd
would be seen as an invalid url by validators
.
precedes #325
Version 0.10.3 added the
public
argument tovalidators.url
which would return 'False' for obvious local urls. This was very useful, but the functionality seems to be broken, maybe since c43826c0dab6d23162b0564c448a31469331b16e . Even the tests are removed. Nothing is mentioned in the changelog, and I find no obvious workarounds...