voxpupuli / json-schema

Ruby JSON Schema Validator
MIT License
1.52k stars 242 forks source link

String "uri" format validation doesn't work #353

Closed k-rudy closed 7 years ago

k-rudy commented 7 years ago

As stated in Readme - uri string format belongs to the list of standard supported validations. But it doesn't work:

require "json-schema"

schema = { "type": "string", "format": "uri" }
JSON::Validator.validate(schema, 'invalid uri string') # this returns true, but provided value is not a valid uri

To make sure this is only an issue with uri and not with date-time for example, I have checked:

schema = { "type": "string", "format": "date-time" }
JSON::Validator.validate(schema, 'not a valid date') # returns false as expected
iainbeeston commented 7 years ago

The problem here is that validating URIs is extremely difficult. In practice almost anything can be a valid uri. If you tried validating something even more outrageous (try "@!£&&&--/" perhaps) it will be invalid.

The underlying implementation relies on the addressable gem to validate URIs. If addressable can parse the string to a URI, then we consider it valid. If it can't then it's invalid.

I suspect your issue might be better raised with the addressable gem, but as I mentioned, almost any string can be a valid URI.

jherdman commented 5 years ago

I hate to bring this up again, but I think this needs to be revisited. Looking at the current code for Addressable::URI.pase we can see that literally ANY string is sent on through — even if it doesn't match the regexp it references. This is deeply surprising behaviour. It means that "I'm a turtle" counts as a valid URI, even your sample outrageous string.

Perhaps it's better to use Ruby's built in URI library? It's less permissive, but that means it's less surprising. Example IRB session:

> require 'uri'
> URI.parse("@!£&&&--/")
URI::InvalidURIError: URI must be ascii only "@!\u00A3&&&--/"
> URI.parse("i like turtles")
URI::InvalidURIError: bad URI(is not URI?): i like turtles

Further edifying discussion on why this may not be an ideal approach: https://github.com/sporkmonger/addressable/issues/161