McAnix closed this issue 4 years ago
The following snippet should resolve your misunderstanding of how the library works:
<?php
use Pdp\Cache;
use Pdp\CurlHttpClient;
use Pdp\Manager;
use Pdp\Rules;
$manager = new Manager(new Cache(), new CurlHttpClient());
$rules = $manager->getRules();
// Resolve against the private section, the ICANN section, then both (the default).
echo $rules->resolve('toto.linode.com', Rules::PRIVATE_DOMAINS)->getPublicSuffix(), PHP_EOL;
echo $rules->resolve('toto.linode.com', Rules::ICANN_DOMAINS)->getPublicSuffix(), PHP_EOL;
echo $rules->resolve('toto.linode.com')->getPublicSuffix(), PHP_EOL;
The result is the following:
linode.com
com
linode.com
The explanation is simple: linode.com is registered as a private domain. By default, the library uses the longest public suffix it can find for a given domain name. In your example, the longest public suffix is the domain itself, hence it returns the empty string, as stated in the documentation.
Depending on your business logic, you should state explicitly which section you want your domain name to be resolved against. Both results are correct; you just need to decide which one fits your use case.
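To make the longest-match behaviour concrete, here is a minimal, self-contained sketch. It is a toy model under stated assumptions, not the library's implementation: the rule table is hand-written to mirror this thread (it is not the real Public Suffix List), the function names toyPublicSuffix and toyRegistrableDomain are hypothetical, and wildcard and exception rules are ignored.

```php
<?php
// Toy rule table for illustration only. NOT the real Public Suffix List.
const TOY_RULES = [
    'icann'   => ['com', 'za', 'co.za'],
    'private' => ['linode.com'],
];

/**
 * Return the longest rule from the selected sections that is a suffix of $host.
 * Falls back to the last label when nothing matches (the PSL's implicit "*" rule).
 */
function toyPublicSuffix(string $host, array $sections = ['icann', 'private']): string
{
    $labels = explode('.', strtolower($host));
    $rules = [];
    foreach ($sections as $section) {
        $rules = array_merge($rules, TOY_RULES[$section]);
    }
    // Candidates go from longest to shortest, so the first hit is the longest match.
    for ($i = 0, $n = count($labels); $i < $n; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $rules, true)) {
            return $candidate;
        }
    }
    return end($labels);
}

/** Registrable domain = public suffix plus one more label, or '' if no label is left. */
function toyRegistrableDomain(string $host, array $sections = ['icann', 'private']): string
{
    $labels = explode('.', strtolower($host));
    $suffixLength = count(explode('.', toyPublicSuffix($host, $sections)));
    if (count($labels) <= $suffixLength) {
        return ''; // the host is itself a public suffix
    }
    return implode('.', array_slice($labels, -($suffixLength + 1)));
}

echo toyPublicSuffix('toto.linode.com', ['private']), PHP_EOL;
echo toyPublicSuffix('toto.linode.com', ['icann']), PHP_EOL;
echo toyPublicSuffix('toto.linode.com'), PHP_EOL;
```

With this toy table the three calls print linode.com, com and linode.com, matching the output above. In the same model, toyRegistrableDomain('linode.com') is the empty string when both sections are searched, and 'linode.com' when only the 'icann' section is used.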
Also, this question is already answered in the pinned issue #240.
I understand the intricacies of private vs ICANN TLDs. However, this doesn't solve the fundamental user problem.
The user is asking the domain parser for the domain parts. The user is not aware of which company recently added its domain to an unofficial TLD suffix list. The user just wants to check whois for the registrable domain, or see which registry (TLD) the domain is registered under.
Why is it now the developer's problem to write a wrapper class to parse out "failures" so that the user gets what they expect?
That's because, from your business point of view, the default behaviour, which is in line with how the PSL test suite works, is not what you want.
$domain = $rules->resolve('linode.com');
The PSL was first created to resolve cookie issues, and in this regard the default behaviour is better suited for that.
$domain = $rules->resolve('linode.com', Rules::ICANN_DOMAINS);
is what, most of the time, developers should be using when resolving domains against TLD suffixes. This should maybe be more explicit in the documentation.
Ah ha! Thank you for that explanation. We use it predominantly to determine the registrable domain. I have adjusted all lookups to include ICANN_DOMAINS.
Issue summary
Resolving a domain like 'co.za' or 'linode.com' results in a blank domain object. Both appear in some form on the public suffix list, yet they are perfectly valid domain names with websites attached. Setting the rules "section" parameter to ICANN_DOMAINS or PRIVATE_DOMAINS works, but only for 'linode.com' and 'co.za' respectively.
System information
Standalone code, or other way to reproduce the problem
use Pdp\Cache;
use Pdp\CurlHttpClient;
use Pdp\Manager;
use Pdp\Rules;

$manager = new Manager(new Cache(), new CurlHttpClient());
$rules = $manager->getRules()
    ->withAsciiIDNAOption(IDNA_NONTRANSITIONAL_TO_ASCII)
    ->withUnicodeIDNAOption(IDNA_NONTRANSITIONAL_TO_UNICODE);
$domain = $rules->resolve('linode.com');
echo 'Domain: \'' . $domain->getContent() . '\'';
Expected result
Domain: 'linode.com'
Actual result
Domain: ''