hyperboria / bugs

Peer-to-peer IPv6 networking, secure and near-zero-conf.

cjdroute restart on linux (but not osx) requires 60 seconds to re-establish connectivity #153

Closed: glycerine closed this issue 7 years ago

glycerine commented 7 years ago

I'm new to cjdns, which is a very exciting project. Apologies if this is a known issue; I'm also not sure where else to inquire.

I started cjdroute on two nodes, one OSX and one linux. The OSX node is behind my ISP's NAT.

Once both cjdroute nodes are started, I can ping6 one node from the other, and vice-versa, no problem. So far, excellent.

If I shut down and restart the OSX cjdroute process, it is back up almost instantly: pings from the linux node (which is on the open internet, not behind NAT) get through again with no noticeable delay.

However, if I kill and restart the linux cjdroute, it takes about 60 seconds before pings originating from the OSX node over cjdns are able to resume.

The situation is asymmetric. Perhaps it is an artifact of the NAT between them?

Is there any troubleshooting I could do, or known way to reduce the delay upon restart of the linux cjdroute process? I would like to be able to recover the use of the cjdns network quickly after restart.

The fact that the pause is repeatable and lasts almost exactly 60 seconds each time suggests that there must be some kind of software-defined timeout in use somewhere.
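To put a number on the pause rather than eyeballing the ping output, a small script along these lines could help. It is only a sketch: the function names (`probe`, `watch`, `longest_outage`) are made up for illustration, and it assumes a `ping6` binary that accepts `-c` for the packet count, like the one used for the pings in this report.

```python
import subprocess
import time


def probe(addr, ping_cmd="ping6", timeout=5.0):
    """Send one echo request to addr; return True if a reply came back.

    Assumption: ping_cmd accepts -c <count>. A probe that hangs longer
    than `timeout` seconds is counted as a failure.
    """
    try:
        result = subprocess.run(
            [ping_cmd, "-c", "1", addr],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def watch(addr, duration=180.0, interval=1.0):
    """Probe addr repeatedly for `duration` seconds; return the samples."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), probe(addr)))
        time.sleep(interval)
    return samples


def longest_outage(samples):
    """Longest gap in seconds between two successful probes that had at
    least one failed probe in between.

    samples: (timestamp_seconds, ok) pairs in time order. An outage that
    has not recovered by the end of the samples is not counted.
    """
    longest = 0.0
    last_ok = None
    saw_failure = False
    for ts, ok in samples:
        if ok:
            if last_ok is not None and saw_failure:
                longest = max(longest, ts - last_ok)
            last_ok = ts
            saw_failure = False
        else:
            saw_failure = True
    return longest
```

Running `longest_outage(watch(addr))` on the OSX node while restarting cjdroute on the linux node, with `addr` set to the linux node's cjdns IPv6 address, should report a value close to 60 if the pause is as regular as it looks.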

Thank you.

Versions:

OSX (installed with brew install cjdns):

    cjdroute --version
    Cjdns version: unknown
    Cjdns protocol version: 20

linux:

    cjdroute --version
    Cjdns version: cjdns-19.1-4.el7
    Cjdns protocol version: 19

Ouch. Just noticed that the linux version is older. Not obvious to me if this would explain the pause though.


Only the OSX node has a "connectTo" stanza in the "UDPInterface" section. So the OSX node knows how to contact the linux node. But the Linux node has an empty "connectTo" stanza under "UDPInterface". This makes sense because the OSX node is behind the NAT and so doesn't have an external IPv4 address to list.
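To make the asymmetry concrete, here is a rough sketch that lists which peers a node will dial out to. It assumes the config has already been parsed into a plain Python dict (cjdroute.conf itself contains comments, so it is not strict JSON) and that "UDPInterface" is a list of objects each holding a "connectTo" map. The addresses, password and key below are placeholders, and `outbound_peers` is a hypothetical helper name.

```python
def outbound_peers(conf):
    """Return the address:port strings this node will try to connect to."""
    peers = []
    for iface in conf.get("interfaces", {}).get("UDPInterface", []):
        peers.extend(iface.get("connectTo", {}).keys())
    return peers


# Placeholder configs mirroring the setup described above.
osx_conf = {
    "interfaces": {
        "UDPInterface": [
            {
                "bind": "0.0.0.0:40000",
                "connectTo": {
                    "192.0.2.10:40000": {
                        "password": "placeholder-password",
                        "publicKey": "placeholder-public-key.k",
                    }
                },
            }
        ]
    }
}

linux_conf = {
    "interfaces": {
        "UDPInterface": [
            {"bind": "0.0.0.0:40000", "connectTo": {}}
        ]
    }
}

print(outbound_peers(osx_conf))
print(outbound_peers(linux_conf))
```

With this layout only the OSX node ever initiates the connection, so after the linux node restarts it has to wait for traffic from the OSX side before it knows where its peer is.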

kpcyrd commented 7 years ago

This might be a delay until the OSX node realizes it's not connected anymore and reconnects. I'd recommend monitoring ./tools/peerStats on both nodes.

glycerine commented 7 years ago

Hmm... is there some trick to running tools/peerStats? On OSX, it seems to time out and then crash. If this has an obvious fix, apologies, but I'm new to cjdns.

This is a different OSX instance, and I can't seem to bring cjdns up at all on this machine.

~/go/src/github.com/cjdelisle/cjdns (master) $ ./tools/peerStats

/Users/jaten/go/src/github.com/cjdelisle/cjdns/tools/lib/cjdnsadmin/cjdnsadmin.js:196
            if (err) { throw err; }
                       ^

Error: timeout after 10000ms
    at Timeout._onTimeout (/Users/jaten/go/src/github.com/cjdelisle/cjdns/tools/lib/cjdnsadmin/cjdnsadmin.js:27:18)
    at tryOnTimeout (timers.js:232:11)
    at Timer.listOnTimeout (timers.js:202:5)
 ~/go/src/github.com/cjdelisle/cjdns (master) $ git log|head
commit efd7d7f82be405fe47f6806b6cc9c0043885bc2e
Author: Caleb James DeLisle <cjd@cjdns.fr>
Date:   Sat Jun 24 10:39:41 2017 +0200

    Fix the issue raised in #1070 without possibility of name collisions

commit e146d960fd334d677028a708e1553f10900fa952
Merge: f3a1bd0 355d7d7
Author: Caleb James DeLisle <cjd@cjdns.fr>
...
glycerine commented 7 years ago

I'm going to chalk this up to some kind of version skew, at least for now. Feel free to re-open if you can reproduce it.

Also, basic installation had failed on OSX. The secret steps to make cjdns work on OSX are:

  1. Install https://sourceforge.net/projects/tuntaposx/ first.

  2. Build cjdns from source, which is easy: git clone it, then just use the ./do script.

  3. Delete the ETHInterface stanza from the cjdroute.conf file before using it.

  4. Run cjdroute as root.
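Step 3 can be done by hand in an editor. For anyone scripting it, a minimal sketch follows, assuming the config has already been parsed into a Python dict (the raw cjdroute.conf has comments and is not strict JSON) and that the stanza sits under "interfaces". `drop_eth_interface` is a hypothetical helper, not part of cjdns.

```python
import copy


def drop_eth_interface(conf):
    """Return a copy of conf with the interfaces.ETHInterface stanza removed.

    The input dict is left untouched; a missing stanza is not an error.
    """
    cleaned = copy.deepcopy(conf)
    cleaned.get("interfaces", {}).pop("ETHInterface", None)
    return cleaned
```

Working on a copy keeps the original config available in case the stanza needs to be restored later.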