caolan / forms

An easy way to create, parse and validate forms in node.js
MIT License

Use WHATWG URL for url and email validation #203

Open stevenvachon opened 7 years ago

stevenvachon commented 7 years ago

Via universal-url, as it covers far more edge cases (such as IDNAs and IPv6) than a simple regex will.

const isEmail = email => {
  try {
    const url = new URL(`mailto:${email}`);
    return url.search === '';
  } catch (error) {
    return false;
  }
};

const isURL = url => {
  try {
    const parsed = new URL(url);
    // URL#protocol includes the trailing colon, e.g. 'http:'
    return parsed.protocol === 'http:' || parsed.protocol === 'https:' || parsed.protocol === 'ftp:';
  } catch (error) {
    return false;
  }
};

ljharb commented 7 years ago

In order to do that, we'd have to ship the entire URL polyfill to browsers - or require that users do it. The polyfill still seems prohibitively large.

(also, in general, email validation that's anything more than "it has an @, and you can receive an email sent to it" is useless, so I'm not interested in making email validation "better", when it needs to be reduced, not increased)
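
For illustration, the "it has an @" half of that check could look roughly like the sketch below. `looksLikeEmail` is a hypothetical helper name, not something forms exports, and the rule (at least one character on each side of the last "@") is an assumption about what "has an @" means here.

```javascript
// Hypothetical helper, not part of forms: accepts any string with an "@"
// that has at least one character before it and one after it.
const looksLikeEmail = value => {
  if (typeof value !== 'string') return false;
  const at = value.lastIndexOf('@');
  return at > 0 && at < value.length - 1;
};

console.log(looksLikeEmail('a@b'));  // true
console.log(looksLikeEmail('a@'));   // false
console.log(looksLikeEmail('nope')); // false
```

A check this loose deliberately says nothing about deliverability; that part can only be confirmed by actually sending mail.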

stevenvachon commented 7 years ago

URL is correct and complete while this library's regex is neither. File size concerns are addressed via universal-url-lite.

Checking an email address for a correct IDNA is no less useful than checking a URL for the same. Email servers adhere to the same IP and DNS rules as a web server.

ljharb commented 7 years ago

They're slightly different: a@b is a valid email address. The only way you can correctly validate an email address is to send it an email with a secret, and verify receipt of the secret.
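
A minimal sketch of that send-a-secret flow, assuming an in-memory store and leaving the actual mail delivery out. `createChallenge` and `confirmChallenge` are hypothetical names for illustration, not part of forms or any mail library.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory store mapping secret -> email address.
const pending = new Map();

// Generate a random secret for an address. The caller is assumed to
// email this secret to the address using whatever mailer it has.
const createChallenge = email => {
  const secret = crypto.randomBytes(16).toString('hex');
  pending.set(secret, email);
  return secret;
};

// Return the confirmed address if the secret is known, else null.
// Each secret can be used only once.
const confirmChallenge = secret => {
  if (!pending.has(secret)) return null;
  const email = pending.get(secret);
  pending.delete(secret);
  return email;
};
```

A real deployment would also want expiry and persistent storage; both are omitted here to keep the idea visible.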

To clarify: does universal-url include 100% of the unicode compliance it needs, and then universal-url-lite only fails on IDNA URLs?

stevenvachon commented 7 years ago

mailto:a@b is a valid URL as well, but I would think that a@:*&^: is not a valid email address. However: https://github.com/jsdom/whatwg-url/issues/98

universal-url is, to my knowledge, 100% TR46 compliant. universal-url-lite has two shims: one with incomplete TR46 support, and one that relies on the native browser implementations, all of which are currently incomplete.

ljharb commented 7 years ago

It would help if you could provide sets of non-IDNA test cases that pass with your shim, but fail with the current implementation - that would help me understand the risk/reward.

stevenvachon commented 7 years ago

require("forms/lib/validators").url(false)(null, {
  data: "http://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]"
}, result => console.log(result))
//-> Please enter a valid URL.
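
For comparison, here is roughly how the same input fares with a WHATWG `URL` parser. This sketch assumes a runtime where `URL` is a global (recent Node.js or a modern browser) rather than going through universal-url.

```javascript
// Parse the same IPv6 URL with the runtime's WHATWG URL implementation.
const parsed = new URL('http://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]');

console.log(parsed.protocol); // 'http:'
// The parser normalizes the address: leading zeros are dropped and the
// longest run of zero groups is compressed to '::'.
console.log(parsed.hostname); // '[2001:db8:85a3::8a2e:370:7334]'
```

The constructor does not throw for this input, so a `try`/`catch` based validator would accept it.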

ljharb commented 7 years ago

Thanks - so, these are the categories of URLs that using universal-url would allow to be valid:

These are the categories of URLs that currently already work:

Any others?

stevenvachon commented 7 years ago

This library currently only supports http, https and ftp.

ljharb commented 7 years ago

OK, I've edited my list above. Any other categories?

stevenvachon commented 7 years ago

That's all I can think of at the moment.