Closed GoogleCodeExporter closed 9 years ago
So, I tried connecting to user1@a.jappix.com but the connection failed (as
expected). How should I reproduce this? I would like to be able to reproduce
the "hang" case that you are seeing.
Original comment by dhruvb...@gmail.com
on 5 Jun 2011 at 7:21
This is what we discussed about 2 days ago over XMPP. You can reproduce it by
making a DNS server reply that an SRV entry exists, but without using the
correct SRV syntax (it was a CNAME reply in my case).
So I think validating the SRV reply syntax may fix this issue.
Original comment by vanaryon
on 5 Jun 2011 at 9:08
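The SRV reply validation suggested in the comment above could be sketched roughly as follows. This is only an illustration: isValidSrvRecord and filterSrvRecords are hypothetical helper names, not part of node.js or of the project discussed here. It assumes the records have the shape that dns.resolveSrv documents (name, port, priority, weight).

```javascript
// Hypothetical helper (an assumption, not an existing API):
// checks that one entry returned by dns.resolveSrv() looks like an SRV record.
function isValidSrvRecord(rec) {
  if (rec === null || typeof rec !== 'object') return false;
  // SRV port, priority and weight are unsigned 16-bit integers.
  var isUint16 = function (n) {
    return typeof n === 'number' && n % 1 === 0 && n >= 0 && n <= 65535;
  };
  // Target host: dot-separated labels made of letters, digits, '-' and '_',
  // with an optional trailing dot.
  var hostOk = typeof rec.name === 'string' &&
    /^[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+)*\.?$/.test(rec.name);
  return hostOk && isUint16(rec.port) && isUint16(rec.priority) && isUint16(rec.weight);
}

// Hypothetical helper: keeps only the well-formed records of a reply.
function filterSrvRecords(records) {
  return Array.isArray(records) ? records.filter(isValidSrvRecord) : [];
}
```

A caller could treat an empty result from filterSrvRecords the same way as a failed lookup, instead of waiting on a reply it cannot use.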
Do you have a domain on which this behaviour can be reproduced? I am using the
node.js DNS resolver, so unless I see exactly what is happening, it would be
hard to fix it. http://nodejs.org/docs/v0.4.7/api/dns.html#dns.resolveSrv
Original comment by dhruvb...@gmail.com
on 5 Jun 2011 at 9:22
I think if you try connecting to, let's say, stats.jappix.com using XMPP, it
will fail. This is not an XMPP domain, but the SRV bug shows up there because
no SRV entry is configured for "stats".
So the bug might be the same here ;)
Original comment by vanaryon
on 5 Jun 2011 at 9:55
Just checking with the basic.js test:
$> node basic.js --username="test@stats.jappix.com" --password=xx
This works fine and does not hang. What behaviour do you see when you run it?
This is the dig output I see. What do you see?
$> dig -t SRV _xmpp-client._tcp.stats.jappix.com
; <<>> DiG 9.7.3 <<>> -t SRV _xmpp-client._tcp.stats.jappix.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 1154
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;_xmpp-client._tcp.stats.jappix.com. IN SRV
;; AUTHORITY SECTION:
jappix.com. 1656 IN SOA a.dns.gandi.net. hostmaster.gandi.net. 1307095670 10800 3600 604800 10800
;; Query time: 109 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sun Jun 5 15:29:14 2011
;; MSG SIZE rcvd: 114
Original comment by dhruvb...@gmail.com
on 5 Jun 2011 at 10:00
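For context on why an NXDOMAIN answer like the one above need not hang a client: when no SRV record exists, an XMPP client commonly falls back to the domain itself on the default client port 5222. The sketch below illustrates that idea only; resolveXmppTarget is a hypothetical name, and nothing in this thread confirms that the project implements it this way. The third parameter is an assumption added so the lookup function can be replaced in tests.

```javascript
var dns = require('dns');

// Hypothetical helper (an assumption, not the project's code):
// picks the host and port to connect to for an XMPP domain.
// `resolveSrv` defaults to dns.resolveSrv and is injectable for testing.
function resolveXmppTarget(domain, callback, resolveSrv) {
  resolveSrv = resolveSrv || dns.resolveSrv;
  resolveSrv('_xmpp-client._tcp.' + domain, function (err, records) {
    if (err || !records || records.length === 0) {
      // No usable SRV answer (for example NXDOMAIN): fall back to the
      // domain itself on the default XMPP client port.
      return callback(null, { host: domain, port: 5222, viaSrv: false });
    }
    // Lowest priority value wins; weight is ignored in this sketch.
    var best = records.slice().sort(function (a, b) {
      return a.priority - b.priority;
    })[0];
    callback(null, { host: best.name, port: best.port, viaSrv: true });
  });
}
```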
Mhh, that's pretty strange. You are right: using dig now, I don't get any bad
reply on non-SRV domains.
2 days ago, I removed the *.jappix.com record. I think it was the reason for
the bad SRV reply.
Anyway, it was replying with a "CNAME jappix.com.". But if you tell me the SRV
resolution is nodejs-dependent, we'd better leave this bug aside for NXB and
report it to nodejs, telling them that the nodejs SRV module should run a regex
check on the reply.
Original comment by vanaryon
on 5 Jun 2011 at 10:07
Yep. Makes sense.
Original comment by dhruvb...@gmail.com
on 5 Jun 2011 at 10:14
Either way, it would be nice if you could set up such a bad DNS entry on some
domain (or subdomain) so that it can be reported properly, showing the exact
failure case.
Original comment by dhruvb...@gmail.com
on 5 Jun 2011 at 10:15
Mhh, it may break the DNS file, because Gandi's DNS is a bit strange with that.
I am looking through my command-line logs to see if I can find anything
remaining ;)
Original comment by vanaryon
on 5 Jun 2011 at 10:19
I found our chatlogs, where the reply appears:
http://codingteam.net/public/muclogs/jappix@conference.codingteam.net/2011-06-02.html#21:39:22
Original comment by vanaryon
on 5 Jun 2011 at 10:25
Did you have a * record configured at that time? If so, what did it point to
(jappix.com)?
Original comment by dhruvb...@gmail.com
on 5 Jun 2011 at 10:30
It was a * 86400 IN CNAME jappix.com., and I believe it was the cause of the
issue (Gandi's DNS servers run BIND 9, and this bug is very strange for
BIND 9!).
Original comment by vanaryon
on 5 Jun 2011 at 10:33
Which makes me wonder if it's a BIND bug more than a node bug!!
Original comment by dhruvb...@gmail.com
on 5 Jun 2011 at 10:52
I think node & BIND are both at fault on that point:
BIND because it returns a bad answer,
node because it does not filter the answer and detect that it is wrong.
Original comment by vanaryon
on 5 Jun 2011 at 10:57
Actually, node.js has a timeout of 3 mins. Since there were no DEBUG logs, it
wasn't apparent that the DNS SRV record resolution was timing out after 3 mins.
I have added those logs. This should ensure that the client gets a termination
notification within 3 mins (though that seems like a really long time; I have
no idea how to reduce it, since the node.js DNS module doesn't provide any way
of specifying it).
Original comment by dhruvb...@gmail.com
on 7 Jun 2011 at 2:27
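One way to get a shorter effective timeout, without any support from the DNS module itself, is to race the lookup against a timer in application code. The sketch below is a minimal illustration under that assumption; resolveSrvWithTimeout is a hypothetical name, the 'ETIMEOUT' code is set by this wrapper, and the fourth parameter exists only so the lookup function can be replaced in tests. The underlying query is not cancelled: its late answer is simply ignored.

```javascript
var dns = require('dns');

// Hypothetical wrapper (an assumption, not an existing node.js API):
// gives up on an SRV lookup after `timeoutMs` milliseconds.
function resolveSrvWithTimeout(name, timeoutMs, callback, resolveSrv) {
  resolveSrv = resolveSrv || dns.resolveSrv; // injectable for testing
  var done = false;

  // Fires only if the lookup has not answered in time.
  var timer = setTimeout(function () {
    if (done) return;
    done = true;
    var err = new Error('SRV lookup for ' + name +
      ' timed out after ' + timeoutMs + ' ms');
    err.code = 'ETIMEOUT'; // code chosen by this sketch
    callback(err);
  }, timeoutMs);

  resolveSrv(name, function (err, records) {
    if (done) return; // the timer already reported a timeout
    done = true;
    clearTimeout(timer);
    callback(err, records);
  });
}
```

The caller's callback is invoked exactly once, either with the lookup's own result or with the timeout error, so the client can be notified well before the resolver's own 3 minute limit.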
Original issue reported on code.google.com by
vanaryon
on 5 Jun 2011 at 7:01