egulias / EmailValidator

PHP Email address validator
MIT License

Are long dashes (—) allowed in domain name? #296

Open astehlik opened 3 years ago

astehlik commented 3 years ago

I recently stumbled over an issue with (possibly invalid) dashes in domain names:

some-name@with—invalid—dashes.de

This validator accepts the address as valid.

The mailserver disagrees. When sending an email to this address I get an error:

501 5.1.3 Bad recipient address syntax

Also: the native PHP filter_var function does not accept the address.
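
For reference, the filter_var comparison can be reproduced with a few lines of plain PHP. This is only a sketch: the helper name acceptedByFilterVar is made up for illustration, and the em dash is written as the escape "\u{2014}" (PHP 7.0 or later) so it cannot be confused with a normal hyphen.

```php
<?php
// Hypothetical helper for illustration; it only wraps PHP's built-in filter.
function acceptedByFilterVar(string $address): bool
{
    // filter_var() returns the filtered string on success and false on failure.
    return filter_var($address, FILTER_VALIDATE_EMAIL) !== false;
}

// The address from this report, with both em dashes (U+2014) spelled out.
$address = "some-name@with\u{2014}invalid\u{2014}dashes.de";

// Reported above as rejected by filter_var.
var_dump(acceptedByFilterVar($address));
```

No FILTER_FLAG_EMAIL_UNICODE is passed here, so the check runs with the filter's default, ASCII-only rules.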

So who is right here?

Cheers Alex

egulias commented 3 years ago

Hello @astehlik, it depends on the version of the library you are using. V2.x should mark it as valid, since the email RFCs are more open about the characters allowed in the domain. V3.x should mark it as invalid, because it validates the domain against RFC 1035. filter_var is a bad comparison, since even test@toplevel and UTF-8 addresses would be marked as invalid by it. That's why this library exists.

Which version are you using?
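
To make the RFC 1035 point concrete, here is a rough sketch of the "preferred name syntax" from that RFC: labels made of letters, digits and hyphens, starting with a letter, ending with a letter or digit, at most 63 octets each and 255 octets for the whole name. The function isRfc1035Domain is a hypothetical helper written for this illustration; it is not the library's actual implementation.

```php
<?php
// Hypothetical helper, not part of EmailValidator.
// Checks a domain against the RFC 1035 preferred name syntax.
function isRfc1035Domain(string $domain): bool
{
    // The whole name is limited to 255 octets.
    if ($domain === '' || strlen($domain) > 255) {
        return false;
    }

    foreach (explode('.', $domain) as $label) {
        // Each label is limited to 63 octets.
        if (strlen($label) > 63) {
            return false;
        }
        // A letter first, then letters, digits or hyphens, ending in a letter or digit.
        // Without the /u modifier the pattern works on bytes, so the bytes of
        // any non-ASCII character fail to match.
        if (preg_match('/\A[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?\z/', $label) !== 1) {
            return false;
        }
    }

    return true;
}

var_dump(isRfc1035Domain('with-valid-dashes.de'));                  // bool(true)
var_dump(isRfc1035Domain("with\u{2014}invalid\u{2014}dashes.de"));  // bool(false)
var_dump(isRfc1035Domain('toplevel'));                              // bool(true)
```

Under these rules a single-label domain such as toplevel passes, while any label containing an em dash does not.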

astehlik commented 3 years ago

Thank you for the feedback @egulias

I updated to version 3.1 and tested again:

$validator = new \Egulias\EmailValidator\EmailValidator();
// Returns true
$validator->isValid('some-name@with—invalid—dashes.de', new \Egulias\EmailValidator\Validation\RFCValidation());
// Also returns true
$validator->isValid('some-name@with—invalid—dashes.de', new \Egulias\EmailValidator\Validation\NoRFCWarningsValidation());

So this should normally return false?

Cheers Alex

egulias commented 3 years ago

Hi @astehlik, it looks like you have found a bug. Yes, it should be false. I believe it is related to encoding. Could you provide the UTF-8 code for the character? Thanks.

astehlik commented 3 years ago

Thank you for the info @egulias

I tested it with the so-called em dash, Unicode code point U+2014, HTML entity &amp;mdash;

See also: https://en.wikipedia.org/wiki/Dash#Common_dashes_and_their_Unicode_code-points
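
In case the raw bytes are useful: U+2014 is encoded in UTF-8 as the three bytes E2 80 94. A small sketch to print them follows; utf8Hex is a hypothetical helper name, and the "\u{2014}" escape assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper: returns the bytes of a string as upper-case hex pairs.
function utf8Hex(string $text): string
{
    // bin2hex() gives two hex digits per byte; split them into pairs for readability.
    return strtoupper(implode(' ', str_split(bin2hex($text), 2)));
}

echo utf8Hex("\u{2014}"), PHP_EOL; // E2 80 94 (em dash)
echo utf8Hex('-'), PHP_EOL;        // 2D (ASCII hyphen-minus)
```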

I hope this helps.

Cheers Alex