23andMe / Yamale

A schema and validator for YAML.
MIT License

[Feature request] URL / URI built in validator #213

Open Amndeep7 opened 1 year ago

Amndeep7 commented 1 year ago

At minimum, all this would need to do is check whether the provided string is a URL (i.e. urllib's urlparse function doesn't throw an error when trying to parse it). If we wanted to get fancy, we could expose the parsed components and allow regexes against them (e.g. to allow both http and https as the protocol but nothing else, such as ftp).
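A minimal sketch of the "fancy" variant, using only the standard library. The function name `check_url` and its `patterns` argument are hypothetical and not part of Yamale; treating a missing scheme or host as invalid is also an assumption about what "is a URL" should mean here.

```python
import re
from urllib.parse import urlparse


def check_url(value, patterns=None):
    """Return True if value parses as a URL with a scheme and a host.

    patterns (hypothetical) maps a parsed component name such as
    "scheme", "hostname" or "path" to a regex it must fully match.
    """
    if not isinstance(value, str):
        return False
    try:
        parts = urlparse(value)
        if not parts.scheme or not parts.hostname:
            return False
        for name, pattern in (patterns or {}).items():
            component = getattr(parts, name)
            text = "" if component is None else str(component)
            if re.fullmatch(pattern, text) is None:
                return False
    except ValueError:
        # urlparse and some of its attributes raise ValueError on certain
        # malformed inputs, e.g. an unclosed IPv6 bracket.
        return False
    return True
```

With this sketch, `check_url("ftp://example.com", {"scheme": r"https?"})` is rejected while the same call with an `http` URL passes. Without any patterns, something like `cat://will` is still accepted, so the component checks carry most of the weight.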

There are workarounds, e.g. using the regex validator with one of the many regexes floating around on the internet and hoping that it actually validates URLs correctly, but that feels like a sketchier solution than it needs to be.
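To make the concern concrete, here is a deliberately narrow pattern of the kind one might paste in. It is an illustrative assumption, not a recommended or complete URL regex, and `matches_simple_http_url` is a hypothetical helper name.

```python
import re

# Narrow on purpose: http or https, a dotted host name,
# then an optional port and an optional path.
SIMPLE_HTTP_URL = re.compile(
    r"https?://"                          # scheme
    r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+"    # host with at least one dot
    r"(:\d{1,5})?"                        # optional port
    r"(/\S*)?"                            # optional path, query, fragment
)


def matches_simple_http_url(value):
    """Return True if the whole string matches the narrow pattern above."""
    return SIMPLE_HTTP_URL.fullmatch(value) is not None
```

This pattern accepts `https://example.com/path?x=1` but rejects `http://localhost:8000` and `http://example.com?x=1`, both of which many people would consider valid URLs.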

nbaju1 commented 1 year ago

To validate any possible URL without specifying any patterns to help the validator would be a momentous task as there are a myriad of possible URLs/URIs that one could call valid. And if you end up needing to feed patterns to the validator it's very close to using regex anyways. As far as I know, urlparse will never throw an error as long as it receives a string as input.

I would definitely like this feature as well though, so hopefully the maintainers are more creative and optimistic than I am :)

cblakkan commented 1 year ago

> To validate any possible URL without specifying any patterns to help the validator would be a momentous task as there are a myriad of possible URLs/URIs that one could call valid. And if you end up needing to feed patterns to the validator it's very close to using regex anyways.

Yeah, this is a good example of how convoluted it gets

> As far as I know, urlparse will never throw an error as long as it receives a string as input.

To help illustrate your point:

>>> import urllib.parse
>>> [urllib.parse.urlparse(url).hostname for url in ['really', 'anything.com', 'cat://will', 'work+just://fine.?', 'http://notveryhelpful?baz=bar']]
[None, None, 'will', 'fine.', 'notveryhelpful']

So there would definitely need to be constraints/configuration. I think Pydantic does a good job of enumerating what most people might find useful.
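As a rough sketch of what such constraints could look like, the hypothetical `UrlRules` class below bundles a few options one might want: allowed schemes, whether a host is required, and an optional maximum length. The names and defaults are illustrative assumptions; they are not Yamale's API and are not copied from Pydantic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple
from urllib.parse import urlparse


@dataclass(frozen=True)
class UrlRules:
    """Hypothetical bundle of URL constraints (not a Yamale or Pydantic type)."""

    allowed_schemes: Tuple[str, ...] = ("http", "https")
    host_required: bool = True
    max_length: Optional[int] = None  # None means no length limit

    def errors(self, value: str) -> List[str]:
        """Return human-readable problems; an empty list means valid."""
        problems = []
        if self.max_length is not None and len(value) > self.max_length:
            problems.append(f"longer than {self.max_length} characters")
        try:
            parts = urlparse(value)
            hostname = parts.hostname
        except ValueError as exc:
            # Some malformed inputs (e.g. an unclosed IPv6 bracket) raise.
            return problems + [f"not parseable: {exc}"]
        if parts.scheme not in self.allowed_schemes:
            problems.append(f"scheme {parts.scheme!r} not allowed")
        if self.host_required and not hostname:
            problems.append("missing host")
        return problems
```

Run against the examples above, `UrlRules().errors("cat://will")` reports the disallowed scheme, while `UrlRules(allowed_schemes=("cat",)).errors("cat://will")` returns an empty list. Returning a list of problems rather than a bare boolean is a design choice made here so that a validator could surface a useful message.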

> I would definitely like this feature as well though, so hopefully the maintainers are more creative and optimistic than I am :)

Right now we're pretty resource constrained but we'll gladly accept pull requests if someone is up to the challenge!