Novik / ruTorrent

Yet another web front-end for rTorrent
Other
2.01k stars 408 forks source link

[PATCH] The GeoIP plugin should resolve hosts in parallel #765

Closed nicobubulle closed 10 years ago

nicobubulle commented 10 years ago

From karel.t...@gmail.com on September 27, 2012 06:16:56

gethostbyaddr() is terrible - it blocks for 20 seconds on IPs with misconfigured DNS.

This one is a bit hackish as it makes assumptions about packet format, but works with most resolvers. Add new conf.php option dnsResolver = '8.8.8.8'; Maybe make socket timeout configurable, too.

Index: lookup.php

--- lookup.php ( revision 2185 ) +++ lookup.php (working copy) @@ -15,6 +15,10 @@ $HTTP_RAW_POST_DATA = file_get_contents("php://input"); if(isset($HTTP_RAW_POST_DATA)) {

Original issue: http://code.google.com/p/rutorrent/issues/detail?id=767

nicobubulle commented 10 years ago

From novik65 on September 28, 2012 02:40:31

This doesn't work for me. In result i have only one record. I think, this is because you must read answer immediately after sending one packet.

nicobubulle commented 10 years ago

From ezzz...@gmail.com on September 28, 2012 13:54:12

Connected udp sockets should be buffered, at least on linux (my setup is debian6, 2.6.32). I think it simply expires on higher-latency connections before replies arrive. Try to increase the timeout or receive buffer size:

socket_set_timeout($dns, 5, 0); socket_set_option($dns,SOL_SOCKET,SO_RCVBUF,65536);

Another possibility is your DNS server sends unexpected replies, or their DNS has some obscure setup (in particular, CNAME->PTR are not handled).

nicobubulle commented 10 years ago

From novik65 on September 29, 2012 01:38:06

Try to increase the timeout

BTW in your code you doesn't check - if a timeout occur. But i increase it.

socket_set_option($dns,SOL_SOCKET,SO_RCVBUF,65536);

PHP Fatal error: Call to undefined function socket_set_option. It disabled in my system. As result - it may be disabled in users system too.

Yet once. 1) If i send packets for 2 peers - then i got host for one of its only. 2) If i send packets for 5 peers - then i got hosts for 4 of its only. 3) If i send packets for 10 peers - then i got hosts for 9 of its only. Code with gethostbyaddr give all hosts (2, 5 and 10) in all cases.

Default recv buffer size is 4096. Response size (in the first case) - 90. As result - buffer size hasn't relation to this.

Another possibility is your DNS server sends

I test you code for 2 different systems (CentOS 5.4 and OnenSuse 11.2) with 2 different DNS servers. Results are the some.

nicobubulle commented 10 years ago

From karel.t...@gmail.com on September 29, 2012 16:40:25

BTW in your code you doesn't check - if a timeout occur. But i increase it.

Once the socket timeouts, fread() returns empty or incomplete buffer.

Code with gethostbyaddr give all hosts (2, 5 and 10) in all cases.

I'm observing the same thing - there is off-by-one error somewhere, I'll investigate. Thanks for testing!

nicobubulle commented 10 years ago

From karel.t...@gmail.com on September 29, 2012 17:59:15

Ok, here goes cleaned up version. Results seem to be the same as for ghbn even for large (700 peers) torrents.

Added support for compressed names in the result set, it seems to be used quite a bit.

conf.php: <?php // configuration parameters

$retrieveCountry = true;
$retrieveHost = true;

// use faster internal resolver instead of gethostbyaddr().
$dnsResolver = '8.8.8.8';

// give up resolving host names after this number of seconds
$dnsResolverTimeout = 3;

$retrieveComments = false;

?>

--- lookup.php ( revision 2185 ) +++ lookup.php (working copy) @@ -15,6 +15,12 @@ $HTTP_RAW_POST_DATA = file_get_contents("php://input"); if(isset($HTTP_RAW_POST_DATA)) {

nicobubulle commented 10 years ago

From novik65 on September 30, 2012 07:24:39

Once the socket timeouts, fread() returns empty or incomplete buffer.

Yes, it can return incomplete buffer. And you code will continue work with incorrect data. This is a not good thing, IMHO.

Anyway, thanks for your work. And sorry, i can test your code (and apply patch) ~ at the next week only.

nicobubulle commented 10 years ago

From karel.t...@gmail.com on September 30, 2012 09:32:06

This is a not good thing, IMHO.

Waiting for all the data defeats the purpose of rapid lookups. If you're not content with fast resolver which occassionaly misses a reply due to timeout, simply don't set 'dnsResolver' in config - slow (but correct) resolver will be used instead :)

Btw - man 7 udp. It states the packets are either as a whole, or nothing. There is no such thing as receiving "half" of udp packet, even in presence of timeouts.

nicobubulle commented 10 years ago

From novik65 on October 08, 2012 01:49:15

1) Can't see any changes in the results. 10 peers, 9 hosts. 2) Some results may be strange. See line with 222.254.51.182.

[08.10.12 12:50:49] Peers count ($idx): 10 [08.10.12 12:50:49] 188.230.117.173 - ip-188-230-117-173.airbites.net.ua [08.10.12 12:50:49] 71.227.162.132 - c-71-227-162-132.hsd1.wa.comcast.net [08.10.12 12:50:49] 121.84.45.184 - 121-84-45-184f1.hyg2.eonet.ne.jp [08.10.12 12:50:49] 184.15.31.162 - 184-15-31-162.dr01.chtn.wv.frontiernet.net [08.10.12 12:50:49] 67.61.228.177 - 67-61-228-177.cpe.cableone.net [08.10.12 12:50:49] 82.228.0.251 - maa78-1-82-228-0-251.fbx.proxad.net [08.10.12 12:50:49] 84.25.220.96 - 5419DC60.cm-5-2d.dynamic.ziggo.nl [08.10.12 12:50:49] 222.254.51.182 - localhost [08.10.12 12:50:49] 85.229.8.72 - c-4808e555.024-135-73746f36.cust.bredbandsbolaget.se

nicobubulle commented 10 years ago

From novik65 on October 08, 2012 02:07:13

But if i comment line

if (substr($buf, $pos, 10) != "\0\x0C\0\1\xC0\x0C\0\x0C\x00\x01") continue;

then all OK. In the 'continue' case buffer contain "\0\x0C\0\1\xC0\x0E\0\x06\x00\x01".

nicobubulle commented 10 years ago

From karel.t...@gmail.com on October 12, 2012 10:55:43

ad 1)

I've used breakdown in http://www.netfor2.com/dns.htm for reference.

\0\x0C\0\1\xC0\x0C\0\x0C\x00\x01 - start at offset 12, rrtype PTR, of class 1 (ipv4?) \0\x0C\0\1\xC0\x0E\0\x06\x00\x01 - means at offset 14, rrtype CNAME, of class 1

Please confirm on a linux box that IP is indeed a CNAME host via:

$ dig -x

As stated in the first post, CNAME is not implemented. Because the protocol becomes complex at that point (ptr reply may, or may not be in the same packet - in which case next query/response cycle recursively invoked until we get PTR).

Removing that continue statement is extremely dangerous, because of high possibility of passing junk to clients. It's better to not resolve the ip at all, than return outright garbage.

Probably it should be noted in conf.php that the resolver is somewhat limited.

As for 2): $ host 222.254.51.182 182.51.254.222.in-addr.arpa domain name pointer localhost.

Nothing wrong at all. gethostbyaddr() will not bother verifying the results either, nor i think there is need to.

nicobubulle commented 10 years ago

From karel.t...@gmail.com on October 12, 2012 11:08:01

Also you can try this variant (decreasing idx only if the continue condition passes). I suspect it might work, but slow things down because there will be case of waiting for a packet which is never going to arrive.

nicobubulle commented 10 years ago

From novik65 on October 15, 2012 00:38:12

Also you can try this variant (decreasing idx only if the continue condition passes)

This variant works correctly.

Status: Fixed