codemonkey85 / isemail

Automatically exported from code.google.com/p/isemail

Feature: Add TLD check #2

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Additionally, an official static TLD list from IANA could be checked. (Java + 
PHP solution)

It is questionable whether this additional check should be controllable by a 
parameter. I think it would be better if the TLD check could not be deactivated.

Note: For maintenance there should be a function, called by developers, which 
checks whether the official TLD list ( 
http://data.iana.org/TLD/tlds-alpha-by-domain.txt ) is equal to our static 
list. That way we can immediately change our source code when we notice that 
IANA has assigned a new TLD. (Maybe also on-demand refreshing of the internal 
TLD table...?)
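
A minimal sketch of such a maintenance check, written in Python for brevity (the thread's actual targets are PHP and Java). The function names `parse_tld_list`, `diff_tld_lists` and `fetch_official_tlds` are hypothetical and not part of isemail. It assumes the IANA file keeps its plain-text layout: one TLD per line, with comment lines starting with `#`.

```python
from typing import Iterable, Set, Tuple
from urllib.request import urlopen

# URL quoted in the issue text above.
IANA_TLD_URL = "http://data.iana.org/TLD/tlds-alpha-by-domain.txt"


def parse_tld_list(text: str) -> Set[str]:
    """Parse the list format: one TLD per line, '#' lines are comments."""
    tlds = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tlds.add(line.lower())
    return tlds


def diff_tld_lists(static: Iterable[str],
                   official: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (missing, stale): TLDs to add to and to drop from the static list."""
    static_set = {t.lower() for t in static}
    official_set = {t.lower() for t in official}
    return official_set - static_set, static_set - official_set


def fetch_official_tlds(url: str = IANA_TLD_URL, timeout: float = 10.0) -> Set[str]:
    """Download and parse the official list (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_tld_list(response.read().decode("utf-8"))
```

A developer would call `diff_tld_lists(our_static_list, fetch_official_tlds())` before a release and update the static table if either returned set is non-empty.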

Original issue reported on code.google.com by danielma...@googlemail.com on 14 Jun 2010 at 9:41

GoogleCodeExporter commented 9 years ago

Original comment by dominic....@gmail.com on 15 Jun 2010 at 8:00

GoogleCodeExporter commented 9 years ago
For backwards compatibility I guess we could look for this list and use it if 
it's available. If it's not available then we allow any TLD.

I don't think we need to implement our own caching - HTTP will do this for us.
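
The fallback described here could look roughly like the following Python sketch. The names `tld_of` and `tld_is_acceptable` are hypothetical, and `known_tlds=None` is an assumed convention meaning "the list could not be loaded".

```python
from typing import Optional, Set


def tld_of(domain: str) -> str:
    """Return the last label of a domain, lower-cased, ignoring a trailing dot."""
    return domain.rstrip(".").rsplit(".", 1)[-1].lower()


def tld_is_acceptable(domain: str, known_tlds: Optional[Set[str]]) -> bool:
    """Check the TLD against the list if one is available.

    known_tlds is None when the list is unavailable; in that case any TLD
    is allowed, which preserves the old behaviour.
    """
    if known_tlds is None:
        return True
    return tld_of(domain) in known_tlds
```

With this shape, callers that never supply a list see no change in results, which is the backwards-compatibility point made above.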

Original comment by dominic....@gmail.com on 13 Aug 2010 at 1:19

GoogleCodeExporter commented 9 years ago
Well, I will start later on adding a TLD check to the PHP and Java code. In 
Java it is easy to check against a list. In PHP I will figure out which 
approach works best and then send you a patch.

A big problem with the TLD check is that most test cases will now fail, since 
they have no valid TLDs.

Example: "valid@provider.xyz" (not in your test cases) was valid and would no 
longer be valid.

I have a possible solution for this: we could append ".com" to all these 
cases, so "valid@provider.xyz.com" would then be valid again. But there is 
still a risk: we have now changed "provider" from a domain name to a subdomain 
name, and "xyz" is now the domain name and not the TLD. This could change the 
semantics of the test cases.
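
To make the shift concrete, here is a small Python illustration. `domain_levels` is a hypothetical helper that simply splits on dots and does not consult any real TLD list.

```python
from typing import Dict, List, Optional, Union


def domain_levels(domain: str) -> Dict[str, Union[Optional[str], List[str]]]:
    """Split a domain into TLD, second-level label and remaining subdomains."""
    labels = domain.lower().split(".")
    return {
        "tld": labels[-1],
        "second_level": labels[-2] if len(labels) > 1 else None,
        "subdomains": labels[:-2],
    }


before = domain_levels("provider.xyz")
after = domain_levels("provider.xyz.com")
```

In `before`, "provider" is the second-level label and "xyz" the TLD. In `after`, "provider" has moved into `subdomains` and "xyz" has become the second-level label under "com".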

Another idea would be to disable TLD checks for some test cases (I might extend 
the XML file for that). So we would have simple checks and checks for the TLDs 
themselves.

What do you think?

Original comment by danielma...@googlemail.com on 8 Sep 2010 at 5:32

GoogleCodeExporter commented 9 years ago
Another decision for you: I wonder what an invalid TLD should count as - a 
warning or an error?

* From a technical point of view, an invalid TLD is still OK according to the 
RFCs.
* From a practical point of view, a mail address with a TLD which is not 
assigned by IANA is really IMPOSSIBLE, so it is an error rather than a warning.

Original comment by danielma...@googlemail.com on 8 Sep 2010 at 5:48

GoogleCodeExporter commented 9 years ago
I considered this feature for version 2 and decided that the DNS check was 
sufficient. If this feature is implemented at a later date then I recommend a 
Warning is raised, not an Error.

Two reasons for this: firstly, it would allow development testing to continue 
without continually raising errors for somebody@localhost. Secondly, it allows 
for dark nets that don't use the IANA TLDs (somebody@domain.ano for instance).
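
A rough Python sketch of the "Warning, not Error" behaviour recommended here. The `Severity` enum and `tld_severity` function are hypothetical and do not reflect isemail's actual result codes. The split on the last "@" is deliberately naive; full address parsing is assumed to happen elsewhere.

```python
from enum import Enum
from typing import Set


class Severity(Enum):
    OK = 0
    WARNING = 1
    ERROR = 2


def tld_severity(address: str, known_tlds: Set[str]) -> Severity:
    """Grade only the TLD: an unknown TLD is a warning, never an error."""
    _, separator, domain = address.rpartition("@")
    if not separator or not domain:
        # No domain part at all: this is a structural problem, not a TLD one.
        return Severity.ERROR
    tld = domain.rstrip(".").rsplit(".", 1)[-1].lower()
    return Severity.OK if tld in known_tlds else Severity.WARNING
```

Under these assumptions, both somebody@localhost and somebody@domain.ano come back as `Severity.WARNING`, so callers who want to accept them can treat warnings as passing.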

Original comment by dominic....@gmail.com on 8 Sep 2010 at 8:34

GoogleCodeExporter commented 9 years ago

Original comment by dominic....@gmail.com on 8 Sep 2010 at 8:36

GoogleCodeExporter commented 9 years ago
Hello.

I don't share your opinion here. A DNS check is an optional feature since it 
accesses the network. Not everyone likes that - especially when running a big 
batch script that checks thousands of addresses.

The TLD check I want to add checks against an internal table without network 
access and would ALSO be optional via parameters. So the tests would not fail. 
Things like "localhost" or darknets would also pass, since the check can be 
disabled. Please note that for web servers it makes no sense to allow darknets 
or localhost. If e.g. a PHP web page wants to check the address of a visitor, 
then we cannot allow "hello@localhost" since we want "international" addresses.

I highly recommend adding the TLD check with cached tables, so that the 
offline check can be improved.

Original comment by danielma...@googlemail.com on 8 Sep 2010 at 6:10

GoogleCodeExporter commented 9 years ago
Quote from email:

"Hi Daniel,

If you propose to check against a static (internal) list of TLDs then I
strongly disagree. New TLDs are being added at an increasing rate - will you
include .xxx in your list? It is invalid now but may become valid very soon,
making your function less useful. The only sensible way to do this is to do an
online check against the IANA list at run time and then there is no real
advantage against a full DNS check.

The underlying purpose of is_email is to check for validity against the
relevant RFCs. Checking of actual domains or TLDs is a nice-to-have that
shouldn't be included if it guarantees future obsolescence."

OK, even though I thought IANA TLDs were very rarely added, I can accept this 
"WontFix" now. If we are talking about an online check (and not an internal 
TLD table as I assumed), the DNS check indeed implicitly checks the TLD.

Original comment by danielma...@googlemail.com on 9 Sep 2010 at 9:06

GoogleCodeExporter commented 9 years ago
There hasn't been much change since 2005, when .cat, .jobs, .mobi, .tel, 
.travel were added. Also East Timor changed from .tp to .tl

Here's a list of proposed new TLDs: 
http://en.wikipedia.org/wiki/Proposed_top-level_domain - there's quite a lot!

The one nearest to approval is .xxx but the others will probably follow in the 
next few years.

Original comment by dominic....@gmail.com on 9 Sep 2010 at 9:19

GoogleCodeExporter commented 9 years ago
Also .post might make an appearance anytime now: 
http://en.wikipedia.org/wiki/.post

Original comment by dominic....@gmail.com on 9 Sep 2010 at 9:28