Closed x-cimo closed 11 years ago
In my case, the DNS server used doesn’t have much of an impact:
$ cat /etc/resolv.conf
# Generated by Connection Manager
nameserver 8.8.8.8
But I tried modifying /var/cache/resolv.conf to use my ISP’s DNS servers, Google’s (above) and OpenDNS; it’s all the same.
But — apparently, this is something of a longstanding issue since glibc-2.10 when built with dual-stack support. See a discussion of this issue at Arch Linux forums, including further links and workarounds: https://bbs.archlinux.org/viewtopic.php?id=75770
Drepper dismisses this as broken DNS servers or firewalls (http://udrepper.livejournal.com/20948.html), but it seems to be more nuanced than that (https://fedoraproject.org/wiki/Networking/NameResolution/ADDRCONFIG or https://bugzilla.redhat.com/show_bug.cgi?id=505105 or any number of similar reports for Ubuntu and others). I’m seeing this behind RouterOS routers (which is usually a higher quality than your usual Linksys &c junk) and I’m not aware of any firewalling or filtering by the ISP — but of course that doesn’t say much, does it. I didn’t have any luck making Wireshark capture anything on another machine promiscuously so far to check this.
Supposedly what happens is that glibc sends the A and AAAA requests in parallel and some broken IPv6-unaware DNS servers don’t send any reply back. (I’m not sure about the firewall-eats-it argument; I think the likelihood of OpenDNS and Google being broken is quite low.) glibc handles this with a timeout, hence the delay. The timeout should only happen once, after which glibc knows the DNS server is problematic and won’t do this optimization again, but it happens once per process. So if thumbnails are fetched by dedicated processes launched for each thumbnail, that would explain the delays; it certainly explains the ping google.com delays.
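A rough way to observe the delay described above is to time the same lookup with and without IPv6 in the requested address family. This is only a sketch: time_lookup is a name made up for this illustration, and it assumes Python's socket.getaddrinfo goes through the system resolver, so that AF_UNSPEC triggers both A and AAAA queries while AF_INET triggers only A.

```python
import socket
import time


def time_lookup(host, family=socket.AF_UNSPEC):
    """Return (seconds, addresses) for a single getaddrinfo() call.

    time_lookup is a hypothetical helper for this illustration only.
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        infos = []  # resolution failed; the elapsed time is still of interest
    elapsed = time.monotonic() - start
    # info[4] is the sockaddr tuple; its first element is the address string
    addresses = sorted({info[4][0] for info in infos})
    return elapsed, addresses
```

On an affected setup, time_lookup("google.com") would be expected to take noticeably longer than time_lookup("google.com", socket.AF_INET). Each fresh process would pay the timeout again, so the comparison is best run in a new interpreter each time.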
The above thread suggests the following workarounds (if you don’t want to patch glibc):
1. Add options single-request to /etc/resolv.conf. I verified that this does help in my case.
2. Use nscd. This would help, because all DNS queries would pass through the single caching daemon process and only the first query would incur the timeout. This sounds like a generally good idea to me, regardless of this issue.
I confirm that adding options single-request works well (ping + loading fanart images)! Thanks for solving this nasty glibc bug. And yes, I use a RouterOS router like @vslavik does, but I don't think it makes any difference.
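The options single-request workaround amounts to one extra line in resolv.conf. The sketch below shows that edit as a pure text transformation. add_resolver_option is a hypothetical helper, not part of any tool mentioned in this thread, and it assumes the resolver accepts the option on a separate options line. Because the file shown earlier is generated by Connection Manager, a manual edit may well be overwritten.

```python
def add_resolver_option(resolv_conf_text, option="single-request"):
    """Return resolv.conf content with `option` present on an options line.

    Hypothetical helper for illustration; it only transforms text and
    never touches /etc/resolv.conf itself.
    """
    lines = resolv_conf_text.splitlines()
    for line in lines:
        fields = line.split()
        if fields and fields[0] == "options" and option in fields[1:]:
            return resolv_conf_text  # already set, nothing to do
    # Assumption: a separate "options" line is fine for the resolver.
    lines.append(f"options {option}")
    return "\n".join(lines) + "\n"
```

Applied to the two-line file shown at the top of the thread, this keeps the comment and the nameserver line and appends options single-request. Running it a second time leaves the text unchanged.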
And I use a Juniper SSG-5 for my router, not a crappy off the shelf piece of equipment for sure. I will try the single-request if I actually decide to dump XBMCbuntu for OpenELEC.
@stefansaraev ?
single-request is not an option for now.. 53b7c6000a could be a possible fix. please test
@dukeczech @vslavik if gai.conf does not fix the issue for you, please also test 0c5caa599f5982. or ping @sraue to make a testbuild but I believe just patching gai.conf should be enough
for a quick gai.conf test without rebuilding. (changes lost on reboot):
cp -a /etc/ /tmp/
mount --bind /tmp/etc/ /etc/
edit /etc/gai.conf and add this:
# this is likely already present
precedence ::ffff:0:0/96 100
# this is important
scopev4 ::ffff:169.254.0.0/112 2
scopev4 ::ffff:127.0.0.0/104 2
scopev4 ::ffff:0.0.0.0/96 14
make sure to remove options single-request from resolv.conf while testing
and btw. nscd is not an option too
53b7c60 could be a possible fix
Sorry, no, like any other time, shotgun debugging doesn’t work this time either. gai.conf rules describe ordering of the getaddrinfo() result set, while the issue is with receiving the results before that.
please also test 0c5caa5.
…unlike this, which does touch the relevant code. But yes, a build to test would be handy (apparently the nightlies are not built anymore?).
single-request is not an option ... and btw. nscd is not an option too
Thanks for that enlightening explanation.
53b7c60 doesn't fix this issue (I made a build and also tried the "mount --bind" trick).
0c5caa5 doesn't fix this either.
Update: I swapped my RouterOS router for a D-Link DIR-655 and the problem no longer appears. It seems to be related to some combination of RouterOS and the eglibc DNS NSS improvement:
The problem with this change was that there are broken DNS servers and broken firewall configurations which prevented the two results from being received successfully. Some broken DNS servers (especially those in cable modems etc) only send one reply. For this reason Fedora had this change disabled in F10.
FYI with 0c5caa5, if you have ONLY a link-local address, there will be NO AAAA query at all. if you live in a real v6 environment and your host is (supposed to be) reachable from the net - this does not apply to you, and well.. you have a problem :)
FYI with 0c5caa5, if you have ONLY a link-local address, there will be NO AAAA query at all.
I didn’t try it (unlike @dukeczech, who said above this does not help), but I very much doubt this patch helps either. In my configuration, I don’t even have link-local IPv6. The only IPv6-capable interface is the loopback:
# ifconfig
eth0 Link encap:Ethernet HWaddr 00:01:2E:23:52:14
inet addr:192.168.11.101 Bcast:192.168.11.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1707084 errors:0 dropped:9 overruns:0 frame:0
TX packets:349584 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2503820015 (2.3 GiB) TX bytes:30840773 (29.4 MiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:4 errors:0 dropped:0 overruns:0 frame:0
TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:288 (288.0 B) TX bytes:288 (288.0 B)
#
And you’ll notice that the code touched by this patch already used the IN6_IS_ADDR_LOOPBACK check before; in other words, the logic stays the same in my network’s configuration. Yet AAAA queries are sent.
And I suppose options single-request-reopen “is not an option” either?
http://openelec.tv/news/20-project/102-new-unofficial-addon-repo here we have a "tcpdump" addon. test with / without 0c5caa5 and / or with ipv6 fully disabled via extlinux.conf (append ipv6.disable=1) or sysctl (net.ipv6.conf.*.disable_ipv6 = 1):
tcpdump -nNpvi eth0 port 53
to add any options to resolv.conf we have to patch connman. I would like to avoid this but make ipv6 support optional and disabled by default instead.
0c5caa5 & with broken router (Mikrotik RB150 with RouterOS)
0c5caa5 & with router (d-link DIR655)
0c5caa5 & with broken router (Mikrotik RB150 with RouterOS)
0c5caa5 & with router (d-link DIR655)
OpenELEC:~/.xbmc/userdata # ifconfig
eth0 Link encap:Ethernet HWaddr 80:EE:73:07:79:3B
inet addr:192.168.1.35 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2747 errors:0 dropped:0 overruns:0 frame:0
TX packets:1291 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:517852 (505.7 KiB) TX bytes:188761 (184.3 KiB)
Interrupt:44
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:140 errors:0 dropped:0 overruns:0 frame:0
TX packets:140 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:15208 (14.8 KiB) TX bytes:15208 (14.8 KiB)
0c5caa5 doesn't make any difference.
uh. huh. and you are sure it is built with 0c5caa5? the result here is a bit different:
no v6: http://sprunge.us/bJgc v6 link-local: http://sprunge.us/AcNL v6 http://sprunge.us/IZKU
EDIT: well. I am on master/glibc 2.18. will switch now to 3.2 branch and rebuild. for eglibc 2.17 the patch might be a bit different. will let you know when I am done and will provide a generic64 build for testing if it's ok for you
rebuilding eglibc only is enough
APPLY PATCH (common): /home/duke/openelec/openelec-3.2/packages/toolchain/devel/eglibc/patches/eglibc-fix-dns-with-broken-routers.patch
^^ this looks fine but you must do "PROJECT=xxx ARCH=yyy ./scripts/clean eglibc" before make release.
EDIT: oops. I didn't notice the "> build.log" redirect. it is okay.
duke@intel-i7:~/openelec/openelec-3.2/build.OpenELEC-ION.x86_64-devel/eglibc-2.17-22321/sysdeps/unix/sysv/linux$ diff check_pf.c check_pf.c.orig
236,237c236
< if (!IN6_IS_ADDR_LOOPBACK (address) &&
< !IN6_IS_ADDR_LINKLOCAL (address))
---
> if (!IN6_IS_ADDR_LOOPBACK (address))
PROJECT=ION ARCH=x86_64
with patch
doing a clean build (x86_64 generic but will work on your box). it takes some time. will let you know when it's ready. thanks for your time and effort
@dukeczech can you please join irc and ping @sraue for a testing build?
DEBUG: GetImageHash - unable to stat url errors
How to disable DNS AAAA queries?, AAAA and A, a way to disable AAAA lookups in the resolver, libc6: resolver is broken for local IPv6 networking due to patch from 435646
can you please do curl google.com and watch tcpdump output.
good ;) thank you for your time and testing. now I know for sure what is wrong and I can reproduce here. a proper fix could take some time.
patch from 67cf9779104 needs testing. expected behaviour: v4 only: http://sprunge.us/MQhZ v4 + v6 link-local only: http://sprunge.us/XHGU v6 + v6: http://sprunge.us/MDOe
67cf977 finally fixes this issue!
The proper function of AI_ADDRCONFIG requires that:
1. The usual processing of all node-local and link-local names and addresses is preserved as long as the respective addresses are present.
2. The global name resolution is not affected by the existence or non-existence of node-local and link-local addresses.
3. IN AAAA DNS queries should not be transmitted from a node with no global IPv6 address, and vice versa: IN A queries should not be transmitted from a node with no global IPv4 address.
Unfortunately, the current implementation of getaddrinfo() mostly follows the informational RFC 3493, which fails in both #1, #2, and partially in #3.
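Requirement 3 can be sketched as a small decision function over the addresses configured on a host. This is an illustration only, not the glibc implementation. is_global_scope and dns_query_types are hypothetical names, and "global" is assumed here to mean "neither loopback nor link-local", loosely following the IN6_IS_ADDR_LOOPBACK and IN6_IS_ADDR_LINKLOCAL checks in the diff shown earlier. Under that assumption, private LAN addresses such as 192.168.x.x count as usable.

```python
import ipaddress


def is_global_scope(address):
    """True if the address is neither loopback nor link-local.

    Assumption for this sketch: this is what "global" means in the
    requirements above. It is not the definition glibc uses internally.
    """
    ip = ipaddress.ip_address(address)
    return not (ip.is_loopback or ip.is_link_local)


def dns_query_types(local_addresses):
    """Return the DNS record types that requirement 3 would allow."""
    types = set()
    for address in local_addresses:
        if not is_global_scope(address):
            continue  # loopback and link-local addresses do not count
        version = ipaddress.ip_address(address).version
        types.add("A" if version == 4 else "AAAA")
    return types
```

For the first ifconfig output in this thread (192.168.11.101, 127.0.0.1 and ::1), the function returns only "A", which matches the expectation that no AAAA query should leave such a host.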
EDIT: I set up a 6to4 tunnel to my LAN via the Hurricane Electric tunnel broker service:
this fix is in 3.2.2 now. thanks for reporting and testing.
I have been struggling with OpenElec on all my PCs. I am using the Generic build. The issue is that anything that needs to be fetched from tvdb etc. fails to fetch.
My installs are clean OpenElec 3.0.3 Generic builds.
Out of about 100 movies, maybe 5-10 fetched their content.
I have been documenting this in a thread here: http://openelec.tv/forum/72-xbmc/64460-most-movie-cover-and-artwork-background-don-t-load#74575
I have found out that it's related to DNS. DNS is slow to resolve even with 8.8.8.8.
ping google.ca takes 10 seconds before starting; however, ping -4 google.ca works perfectly.
I have disabled IPv6, tried static IPs, and set Google DNS; nothing worked.
As a last resort, I added the IPs for the movie db and the other image providers that XBMC uses to my .config/hosts.conf and IT WORKED.
All my covers, previews, and artwork started to load right away.
Here is what I added to hosts.conf:
204.246.169.111 cf2.imgobject.com
204.246.169.231 cf1.imgobject.com
204.246.169.83 cf3.imgobject.com
190.93.253.95 thetvdb.com
Obviously hardcoding IP is bad... Does anyone have an idea what could be going wrong?