python-validators / validators

Python Data Validation for Humans™.
MIT License

[feature] add non-exhaustive URI validation #352

Closed yozachar closed 2 months ago

yozachar commented 5 months ago

Suggestions are welcome:

https://github.com/python-validators/validators/blob/cdc987d4b12680fc6469825089d567ec5841f997/src/validators/uri.py#L24-L41

tofetpuzo commented 2 months ago

Are you taking any new suggestions?

yozachar commented 2 months ago

Constructive ones, please go ahead.

tofetpuzo commented 2 months ago

Define a list of valid URI schemes, for example:

    VALID_SCHEMES = ['http', 'https', 'ftp', 'mailto', 'file']

  1. We could make the value parameter more flexible: rather than restricting it to a single URI string, it could also accept a file path (e.g. src/path/to/file) whose contents include several different URLs.
  2. Do a recursive search (perhaps with a regular expression) to match the pattern.

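
The two suggestions above can be sketched roughly as follows. This is only an illustration, not the library's code: the names CANDIDATE and find_uris are hypothetical, VALID_SCHEMES is the list proposed in this comment, and the candidate pattern is deliberately loose.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist taken from the suggestion above; not the library's list.
VALID_SCHEMES = {"http", "https", "ftp", "mailto", "file"}

# Loose candidate pattern: a scheme name, a colon, then non-whitespace characters.
CANDIDATE = re.compile(r"\b[a-zA-Z][a-zA-Z0-9+.-]*:[^\s]+")


def find_uris(text):
    """Return the candidates found in `text` whose scheme is in VALID_SCHEMES."""
    return [
        candidate
        for candidate in CANDIDATE.findall(text)
        if urlsplit(candidate).scheme.lower() in VALID_SCHEMES
    ]


sample = "https://example.com/a\nmailto:me@example.com\nfoo:bar\nrtmp://host/live"
print(find_uris(sample))
```

The text passed to find_uris could come from a file read by the caller, which keeps the validator itself working on plain strings.
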
tofetpuzo commented 2 months ago

Is it something we should do?

yozachar commented 2 months ago

Something similar yeah.

tofetpuzo commented 2 months ago

Can I take up the challenge? :)

yozachar commented 2 months ago

Sure, go ahead.

tofetpuzo commented 2 months ago

Hey @yozachar, can we use socket.inet_aton(ip) in Python to validate IP addresses in a helper function, and then update the scheme list in uri.py?

    # URL-based schemes
    if any(
        value.startswith(item + "://") for item in {
            "ftp", "ftps", "git", "http", "https",
            "irc", "rtmp", "rtmps", "rtsp", "sftp",
            "ssh", "telnet", "gopher", "ldap", "sip",
            "nfs", "mqtt", "smb", "udp"
        }
    ):

yozachar commented 2 months ago

IP addresses are already validated in ip_address.py. Why use socket.inet_aton?
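
For context on the question, here is a small standard-library comparison. It does not show how ip_address.py works; inet_aton_ok and strict_ipv4_ok are hypothetical helper names. It only illustrates that socket.inet_aton handles IPv4 only and, per the Python documentation, also accepts strings with fewer than three dots, while ipaddress.IPv4Address requires a full dotted quad.

```python
import ipaddress
import socket


def inet_aton_ok(value):
    """True if socket.inet_aton accepts the string (IPv4 only, lenient)."""
    try:
        socket.inet_aton(value)
        return True
    except OSError:
        return False


def strict_ipv4_ok(value):
    """True if the string is a full dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


for candidate in ("192.168.1.123", "127.1", "not-an-ip"):
    print(candidate, inet_aton_ok(candidate), strict_ipv4_ok(candidate))
```

The shorthand form "127.1" is where the two checks are expected to disagree, which is one reason a lenient parser can be a poor fit for validation.
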

tofetpuzo commented 2 months ago

Hi, that makes sense. I looked through the message thread, and the error below is why I assumed this feature was requested. See the code below.

import validators

streamurl = "rtmp://192.168.1.123:1935/live/test"
print(validators.url(streamurl))

Output: ValidationError(func=url, args={'value': 'rtmp://192.168.1.123:1935/live/test'})

Or does it mean we just want to validate other schemes, like the few below that I could not find in uri.py?

Proposed schemes

{
    "Uri": {
        "udp": "udp://192.168.1.1:1234/path;param=value?",
        "gopher": "gopher://gopher.example.com/1/path;type=1?search#frag",
        "ldap": "ldap://ldap.example.com/cn=John%20Doe,dc=example,dc=com;scope=one?sn#frag",
        "sip": "sip://user:password@sip.example.com/path;transport=tcp?subject=Hello#frag",
        "smb": "smb://fileserver.example.com/share/path;param=value?query=1#fragment"
    }
}

A few thoughts

  1. Have a regular expression that matches the schemes above.
  2. Compile it once with re.compile() as uri_regex, e.g.

    # Validate using regex for URL-based schemes
    if uri_regex.match(value):
        return True
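
A minimal sketch of the regex idea, under stated assumptions: URL_SCHEMES is the scheme set quoted earlier in this thread, the function name is hypothetical, and the pattern only checks for "scheme://authority" followed by an optional path, query, or fragment without whitespace. It is far looser than RFC 3986 and is not the library's implementation. It uses fullmatch rather than match so that trailing junk is rejected.

```python
import re

# Scheme set quoted earlier in the thread (assumed here, not read from uri.py).
URL_SCHEMES = (
    "ftp", "ftps", "git", "http", "https",
    "irc", "rtmp", "rtmps", "rtsp", "sftp",
    "ssh", "telnet", "gopher", "ldap", "sip",
    "nfs", "mqtt", "smb", "udp",
)

uri_regex = re.compile(
    r"(?:" + "|".join(re.escape(s) for s in URL_SCHEMES) + r")://"
    r"[^\s/?#]+"        # authority: at least one character, no whitespace
    r"(?:[/?#]\S*)?",   # optional path, query, or fragment
    re.IGNORECASE,
)


def looks_like_url_scheme_uri(value):
    """Loose shape check for URL-based schemes; hypothetical helper."""
    return uri_regex.fullmatch(value) is not None


print(looks_like_url_scheme_uri("rtmp://192.168.1.123:1935/live/test"))
print(looks_like_url_scheme_uri("foo://example.com"))
```

A shape check like this says nothing about whether the host or port is valid; those parts would still need the existing validators.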