io-developer / php-whois

PHP WHOIS provides parsed and raw whois lookup of domains and ASN routes. PHP 8.0 compatible (5.4+ old versions)
MIT License

Autoconfig feature #37

Open TimSparrow opened 6 years ago

TimSparrow commented 6 years ago

Description

php-whois configuration for TLDs is distributed centrally: changes to a whois server's configuration have to be reported as issues, committed to the config file, and then redistributed via git/composer.

However, to find out a whois server name for a particular TLD, one can use the following shell script:

#!/bin/bash
tld="$1"
# error checking omitted for demonstration purposes
dig +short CNAME "$tld.whois-servers.net"

Example

Implement a TldServer::getWhoisServer($tld) method that returns the server name using this command (via a PHP system call)

Implement a TldServer::getTldList method to get the TLD list from the official site http://data.iana.org/TLD/tlds-alpha-by-domain.txt

Implement an updateConfig($tld) method (not sure which class it belongs to) that queries the server for an individual TLD or for the whole list.

Make the config file non-distributable; keep only a sample file and possibly the parser-to-TLD configuration
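A minimal sketch of the TLD list step, assuming the IANA file keeps its current layout (one uppercase TLD per line, with a leading `# Version ...` comment). The function name `parseTldList` is hypothetical and is not part of php-whois; it only parses text, so fetching the file is left to the caller.

```php
<?php
declare(strict_types=1);

/**
 * Parse the text of tlds-alpha-by-domain.txt into a list of lowercase TLDs.
 * Hypothetical helper, not part of php-whois.
 */
function parseTldList(string $text): array
{
    $tlds = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        // Skip blank lines and comment lines such as "# Version ...".
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        $tlds[] = strtolower($line);
    }
    return $tlds;
}

// Usage (assumes network access and allow_url_fopen):
// $tlds = parseTldList(file_get_contents('http://data.iana.org/TLD/tlds-alpha-by-domain.txt'));
```

Keeping the parsing separate from the download makes it possible to test the function offline and to feed it a locally cached copy of the list.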

io-developer commented 6 years ago

Wow! Very good feature. But I can implement it later (after ~2 weeks)

io-developer commented 6 years ago

Currently you can use this workaround:

$whois = Whois::create();
// \Iodev\Whois\Modules\Tld\TldModule::getServers()
$servers = $whois->getServers();
// ...find and remove the bad TLDs from the list...
// ...add properly configured servers to $servers (\Iodev\Whois\Modules\Tld\TldServer::from...)...
// ...then set the new server list on your Whois instance...
// \Iodev\Whois\Modules\Tld\TldModule::setServers()
$whois->setServers($servers);

TimSparrow commented 6 years ago

Thanks for the response! It is not something really urgent, and I doubt one can use the workaround on every whois query, as it is quite time-consuming.

The suggestion is based on the fact that php-whois sometimes picks the wrong server for a particular TLD, and central management of TLDs is not very efficient. I then came across an answer on Stack Exchange on how to find the current whois server for a particular TLD.

io-developer commented 6 years ago

Your suggestion is OK.

and I doubt one can use the workaround on every whois query,

What do you mean? Do you create a new Whois instance for every query, like Whois::create()->querySomething(), instead of using a single properly configured instance created outside the code that runs the queries?

TimSparrow commented 6 years ago

I am using a single property, as in my quoted example: $this->whois->doSomething();

What I mean is that the config cannot be validated on every run, as that takes too much time and resources. I suggest having some sort of config cache that can be updated or completely rebuilt at any time, leaving the TLD-to-server matching up to the end user instead of maintaining it centrally.

io-developer commented 6 years ago

I see. OK

Notes:

$server = TldServer::fromData(['zone' => '_', 'host' => 'whois.iana.org']);
$parser = $server->getParser();
$mod = Whois::create()->getTldModule();

foreach ($tlds as $tld) {
    $info = $parser->parseResponse($mod->loadResponse($server, $tld));
    if ($info) {
        var_dump([ $tld => $info->getWhoisServer() ]);
    }
}

Steinweber commented 5 years ago

php -r 'var_dump(dns_get_record("be.whois-servers.net"));'

The signature is dns_get_record($host, $type = DNS_ANY), but RFC 8482 deprecates the DNS ANY query type.

I think it is better to use DNS_CNAME:

var_dump(dns_get_record("be.whois-servers.net", DNS_CNAME));

array(1) {
  [0]=>
  array(5) {
    ["host"]=>
    string(20) "be.whois-servers.net"
    ["class"]=>
    string(2) "IN"
    ["ttl"]=>
    int(585)
    ["type"]=>
    string(5) "CNAME"
    ["target"]=>
    string(12) "whois.dns.be"
  }
}
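A minimal sketch that wraps the lookup above in a function, assuming the whois-servers.net CNAME convention shown in this thread. The name `findWhoisServer` and the injectable `$resolver` parameter are hypothetical, added only so the logic can be exercised without network access; by default it calls `dns_get_record()` with `DNS_CNAME`.

```php
<?php
declare(strict_types=1);

/**
 * Look up the whois host for a TLD through its whois-servers.net CNAME.
 * Hypothetical helper, not part of php-whois.
 */
function findWhoisServer(string $tld, ?callable $resolver = null): ?string
{
    // Default resolver: a real DNS query restricted to CNAME records.
    $resolver = $resolver ?? static function (string $host) {
        return @dns_get_record($host, DNS_CNAME);
    };

    $host = strtolower(trim($tld, ". \t\n\r")) . '.whois-servers.net';
    $records = $resolver($host);
    if (!is_array($records)) {
        return null; // dns_get_record() returns false on failure
    }
    foreach ($records as $record) {
        if (($record['type'] ?? '') === 'CNAME' && !empty($record['target'])) {
            return $record['target'];
        }
    }
    return null; // no CNAME published for this TLD
}

// Usage (assumes network access): var_dump(findWhoisServer('be'));
```

Returning null instead of throwing lets a caller skip TLDs that have no entry and continue with the rest of the list.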

io-developer commented 5 years ago

Thanks! Actually, I used this approach in a dev branch (https://github.com/io-developer/php-whois/commits/tld_serverlist_updating), but I got stuck on performance issues and a few open questions:

  1. Some lazy background process is needed, because updating each TLD step by step takes a very long time (some servers do not respond, some are not supported, etc.)
  2. Or on-the-fly updating is needed, when a TLD is requested for the first time or after some interval.

Here is some code I used for testing:

$tlds = TldHelper::loadTldList();
var_dump([
    'tlds' => $tlds,
]);

$server = TldServer::fromData(['zone' => '_', 'host' => 'whois.iana.org']);
$parser = $server->getParser();
$mod = Whois::create()->getTldModule();

foreach ($tlds as $tld) {
    $info = $parser->parseResponse($mod->loadResponse($server, $tld));
    if ($info) {
        var_dump([ $tld => $info->getWhoisServer() ]);
    }
}
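The on-the-fly option from point 2 could look roughly like the sketch below. The class `LazyWhoisHostCache` is hypothetical and is not part of php-whois; it assumes a caller-supplied lookup callable (for example one built on the IANA query above) and keeps results in memory only. A TLD is resolved on first use and again once its entry is older than the TTL.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical lazy cache: resolves a TLD's whois host on first use and
 * again once the cached entry is at least $ttl seconds old.
 */
final class LazyWhoisHostCache
{
    private $lookup;
    private $ttl;
    private $entries = [];

    public function __construct(callable $lookup, int $ttl = 86400)
    {
        $this->lookup = $lookup;
        $this->ttl = $ttl;
    }

    public function get(string $tld, ?int $now = null): ?string
    {
        $now = $now ?? time();
        $entry = $this->entries[$tld] ?? null;
        if ($entry === null || $now - $entry['time'] >= $this->ttl) {
            // Failed lookups (null) are cached too, so an unresponsive
            // TLD is not retried on every call.
            $entry = ['host' => ($this->lookup)($tld), 'time' => $now];
            $this->entries[$tld] = $entry;
        }
        return $entry['host'];
    }
}
```

This spreads the cost of updating over the TLDs that are actually requested, at the price of one slow first query per TLD; persisting `$entries` to a file would be a natural next step.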

Steinweber commented 5 years ago

amphp and reactphp have a lot of functions for async work and child processes.

amphp/parallel, amphp/artax, reactphp

I think the best way is a public GitHub list with a daily automatic push: /whois/data.json, /whois/hash.md5 (or sha1, ...), /whois/version.txt

if contentOf(whois/version.txt) > getMyLocalVersion()
  updateWhoisFromGit(whois/data.json,$exclude=[],$include=[]);

exclude is useful for not updating certain TLDs. include can contain your own local list of TLDs, which is not overwritten. Merge the lists: $newList = array_merge($oldList, $newGitList, $include)
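A minimal sketch of that merge, assuming each list is a flat map of TLD to whois host (the actual format of data.json is not specified in this thread). The function name `mergeServerLists` is hypothetical. It relies on `array_merge` letting later arrays override earlier ones for string keys.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical merge of TLD => whois host maps.
 * TLDs listed in $exclude keep their local value; $include entries always win.
 */
function mergeServerLists(array $local, array $remote, array $exclude = [], array $include = []): array
{
    // Drop remote entries for TLDs that must not be updated.
    $remote = array_diff_key($remote, array_flip($exclude));
    // With string keys, later arrays override earlier ones.
    return array_merge($local, $remote, $include);
}

$merged = mergeServerLists(
    ['be' => 'old.be', 'com' => 'old.com'],
    ['be' => 'new.be', 'com' => 'new.com', 'org' => 'new.org'],
    ['com'],
    ['io' => 'my.io']
);
```

Here `be` is updated from the remote list, `com` keeps its local value because it is excluded, `org` is added, and the local `io` entry from the include list is preserved.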

vesper8 commented 5 years ago

very excited to see this feature added to the master branch

since updating the list is time-consuming, would it make sense to keep a globally updated list somewhere that updates once a day so the rest of us can pull in through a simple get json request?

io-developer commented 5 years ago

Actually, loading a TLD list can already be done in the current version with minor overhead: you can load a list in any format and prepare the data for $whois->setServers().

I don't like the idea of hosting and updating a custom server list on GitHub, because it would require writing some kind of bot and cron job. It looks like reinventing the wheel: I've seen many lists like that in other repositories (some of them dead or no longer updated), and there is no guarantee that such a list is current and valid.