voxpupuli / json-schema

Ruby JSON Schema Validator
MIT License

Does not validate email correctly... #269

Closed neilyoung closed 8 years ago

neilyoung commented 8 years ago

...or I'm making a mistake

2.0.0-p247 :013 > schema = { "type" => "object", "properties" => { "email" => {"type"=>"email"}}}
 => {"type"=>"object", "properties"=>{"email"=>{"type"=>"email"}}}
2.0.0-p247 :014 > JSON::Validator.validate(schema, {"email"=>"a"})
 => true
2.0.0-p247 :015 >

EDIT: My schema wasn't correct. But even with the correct schema it returns true

2.0.0-p247 :003 > require 'json-schema'
 => true
2.0.0-p247 :004 > schema = { "type" => "object", "properties" => { "mail" => {"format"=>"email", "type"=>"string"}}}
 => {"type"=>"object", "properties"=>{"mail"=>{"format"=>"email", "type"=>"string"}}}
2.0.0-p247 :005 > JSON::Validator.validate(schema, {"mail"=>"x"})
 => true
2.0.0-p247 :006 >

RST-J commented 8 years ago

Correct syntactic validation of email addresses as per the RFC is a pain in the ass, which is why we decided not to support it (or rather, not to add a dependency solely for this task). You can find a list of all supported format keywords at the end of the README.

You can provide your own implementation for the email format keyword using a custom format validator (which is also explained in the README ;) ).
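As a rough sketch of what such a custom format validator could look like: the pattern `LOOSE_EMAIL` and the helper `loose_email?` are hypothetical names standing in for whatever "good enough" rule you choose. The registration call follows the form shown in the README (`JSON::Validator.register_format_validator` with a proc that raises `JSON::Schema::CustomFormatError`). The `require` is guarded so the helper can be tried even where the gem is not installed.

```ruby
begin
  require 'json-schema'
rescue LoadError
  # The gem is optional for this sketch; the helper below works without it.
end

# Hypothetical "good enough" rule (an assumption, not the RFC grammar):
# a non-empty local part, a single "@", and a domain containing a dot.
LOOSE_EMAIL = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/

def loose_email?(value)
  value.is_a?(String) && LOOSE_EMAIL.match?(value)
end

if defined?(JSON::Validator)
  # A custom format validator signals failure by raising CustomFormatError.
  email_format = lambda do |value|
    unless loose_email?(value)
      raise JSON::Schema::CustomFormatError.new('must look like an email address')
    end
  end

  # Without a versions argument the format is registered for all drafts.
  JSON::Validator.register_format_validator('email', email_format)

  schema = {
    'type' => 'object',
    'properties' => { 'mail' => { 'type' => 'string', 'format' => 'email' } }
  }
  puts JSON::Validator.validate(schema, { 'mail' => 'x' })
  puts JSON::Validator.validate(schema, { 'mail' => 'user@example.com' })
end
```

With the validator registered, the `{"mail"=>"x"}` document from the report should no longer validate. How strict the rule is stays entirely up to you.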

neilyoung commented 8 years ago

Ok, accepted. I do wonder how http://jsonschemalint.com/draft4/ manages it, though. Maybe there is something to copy there...

RST-J commented 8 years ago

Well, it can't. It checks for a string of at least 3 characters having an @ sign somewhere. So it would accept a@b which is not a valid email address.
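To make that concrete, here is a minimal sketch of a check of that kind. It is a reconstruction from the description above, not the linter's actual code, and `lint_style_email?` is a hypothetical name.

```ruby
# Loose check as described: a string of at least 3 characters
# that contains an "@" somewhere. Nothing else is verified.
def lint_style_email?(value)
  value.is_a?(String) && value.length >= 3 && value.include?('@')
end

puts lint_style_email?('x')    # false: too short and no "@"
puts lint_style_email?('a@b')  # true, even though it is not a valid address
puts lint_style_email?('@@@')  # true as well
```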

Now you could argue that this is still "good enough". And there you get into the world of opinions, where everyone can legitimately have their own. So we decided not to provide a subset of email validation and to leave it up to the user to add his or her own variant of "good enough".

If you look here and into the spec you'll get an idea of how insane the format of a valid email is. I'd bet that there are even mail servers which do not accept weird variants of valid emails.

You should also think about the purpose you are validating an email for. Take the top answer in the Stack Overflow link, for example: just send an email to whatever address you get, and if someone clicks the link, you know it works. That is not always the problem you have, but for confirmation it reveals something: you don't actually need to care about the syntax. If the mail gets somewhere and someone clicks on the link, it obviously works regardless of any spec.

neilyoung commented 8 years ago

It is good enough. I usually use this ugly regexp (Java code) and it works in 99% of known cases. The rest can go to hell or get a better email address. You will never achieve 100%.

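import java.util.regex.Pattern;

public class Email {
    /* Simplified pattern based on RFC 2822; compiled once, not a full RFC parser */
    private static final Pattern PATTERN = Pattern.compile("^[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$", Pattern.CASE_INSENSITIVE);

    private String email; // stays null when the input does not match

    public Email(String email) {
        /* matches() requires the whole input to match, not just a part of it */
        if (PATTERN.matcher(email).matches()) {
            this.email = email.toLowerCase();
        }
    }
}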