Open gissuebot opened 10 years ago
Original comment posted by kevinb@google.com on 2013-12-18 at 06:12 PM
(No comment entered for this change.)
CC: cberry@google.com
We got a report internally (just days after I opened this bug) that the TLD list was changing to include all of the IANA Root Zone Database. That seems to be the case, or at least it seems to be close (maybe differing just by lagging a little?):
$ wget http://www.iana.org/domains/root/db
...
$ wget http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1
...
$ comm -1 -3 <(sed -e 's#//.##' -e 's/.[.]//' effective_tld_names.dat\?raw\=1 | sort -u) <(egrep -o '/domains/root/db/\w+.html' db | egrep -o '\w+[.]' | tr -d . | sort)
bl
bq
eh
mf
movie
plus
ss
tech
tickets
um
There is also a question of whether proposed names should be accepted. I think that "proposed names" may be those at http://icannwiki.com/All_New_gTLD_Applications (that aren't WITHDRAWN?). But I need to look into this.
Semi-related: If we ever decide to more heavily design the public-suffix support of InternetDomainName, we should glance at the API used by https://github.com/whois-server-list/public-suffix-list
Here's what I've learned:
There are successively more restrictive checks that we could offer for a TLD:
InternetDomainName will recognize it.
Any updates on this issue? Thank you
Original issue created by cpovirk@google.com on 2013-12-18 at 05:30 PM
""" The issue appears to be in the API of InternetDomainName.findPublicSuffix() - https://github.com/google/guava/blob/ab29b173055a1ff647516848b176265fc6792ba0/guava/src/com/google/common/net/InternetDomainName.java#L167
The issue appears to be that this class is disregarding Step 2 of "The Algorithm", described at http://publicsuffix.org/list/ - that is, "If no rules match, the prevailing rule is *".
In this model, any domain not on the list is assumed to be registerable at the second level. For example, "au" is not included in the PSL. This should cause "foo.au" to fail to match any rules, and thus fall into the default wildcard rule. In the default wildcard rule, the public suffix is ".au" - and CSIRO is treated as a registerable name.
This is especially important with the many new registries that ICANN is approving; a decision has not been made to automatically add them to the PSL, and so I fear this may cause issues for Java applications in validating these domains.
If the goal is to ensure a name is "valid" (that is, assigned/approved by ICANN), then IANA has a data file that is updated twice daily at http://data.iana.org/TLD/tlds-alpha-by-domain.txt that contains all IANA-assigned gTLDs. It may make sense to incorporate this data into the PSL trie to have a proper "fail open" behaviour.
...
For plausibility checks, the IANA list is a much better resource, for sure. For security checks, the PSL is the best source of data for this.
...
The point of the PSL is not to replace the IANA list but to further reduce scope of registerable labels.
There would be no benefit to the PSL's including the full IANA list, and real performance harm, since step 2 of the algorithm implicitly covers these domains. """
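To make the "Step 2" fallback from the quoted report concrete, here is a minimal sketch in plain Java. It is not Guava's implementation and not the full PSL algorithm: the class name, the `publicSuffix` helper, and the three-entry rule set are all hypothetical, and wildcard and exception rules are ignored. It only illustrates that a domain matching no rule falls back to the implicit "*" rule, so its rightmost label becomes the public suffix.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PublicSuffixSketch {
    // Hypothetical, tiny stand-in for the real Public Suffix List.
    // "au" is deliberately absent, mirroring the example in the report.
    static final Set<String> RULES =
            new HashSet<>(Arrays.asList("com", "uk", "co.uk"));

    /** Returns the public suffix of a lower-case, dot-separated domain name. */
    static String publicSuffix(String domain) {
        String[] labels = domain.split("\\.");
        // Try the longest candidate first so the rule with the most labels wins.
        for (int i = 0; i < labels.length; i++) {
            String candidate =
                    String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (RULES.contains(candidate)) {
                return candidate;
            }
        }
        // No rule matched: the prevailing rule is "*", i.e. the rightmost label.
        return labels[labels.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(publicSuffix("foo.co.uk")); // prints co.uk (explicit rule)
        System.out.println(publicSuffix("foo.au"));    // prints au (default "*" rule)
    }
}
```

Under this model a lookup never fails for lack of a matching rule; whether the rightmost label is a real TLD is a separate plausibility question, which is where the IANA list would come in.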
What would change in InternetDomainName? I would want to talk more to the original bug reporter and to others, but here are some guesses: