SeattleTestbed / seash

Interactive vessel management tool

NATted Affix-enabled seash often fails on NATted Affix-enabled vessels #70

Open choksi81 opened 10 years ago

choksi81 commented 10 years ago

With an Affix-enabled version of seash running behind a NAT, it is close to impossible to reach Affix-enabled vessels running behind other NATs. A typical sequence of events looks like this:

  1. Lookup BadSeash key in advertise server 128.238.63.15 --> value 95cb.....zenodotus.poly.edu:1224, c00c....
  2. Lookup in advertise server 128.238.63.51 --> value 95cb.....zenodotus.poly.edu:1224, c00c....
  3. Lookup in UDP advertise server 128.238.63.50 --> value 95cb... only!
  4. DNS query for 95cb... --> answer 192.168.1.130
  5. DNS query for c00c... --> answer 172.31.12.204
  6. Now we contact two NAT forwarders almost at the same time:
  7. Contact 120.216.1.23, get connection to 95cb..., GetVessels, response, everything good!
  8. Contact 128.59.20.226, ......, everything good!
  9. Contact c00c...'s private IP address --> fails of course
  10. Contact 95cb...'s private IP address --> fails of course

The last two steps are the problem -- these would only make sense if seash and the NATted vessels were on the same LAN (which is improbable in general).
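To illustrate the point, here is a minimal sketch (not seash or Affix code) of how the candidate addresses from the trace above could be separated into publicly routable and private ones before any connection attempt. The function name `split_by_reachability` is hypothetical, and the sketch assumes modern Python with the standard `ipaddress` module rather than the Repy environment seash actually runs in.

```python
import ipaddress


def split_by_reachability(addresses):
    """Split IP address strings into (public, private) lists.

    Hypothetical helper for illustration only. Note that `is_private`
    also covers loopback and link-local ranges, not just RFC 1918 space.
    """
    public, private = [], []
    for addr in addresses:
        if ipaddress.ip_address(addr).is_private:
            private.append(addr)
        else:
            public.append(addr)
    return public, private


# Addresses taken from the trace above: two NAT forwarders and the
# two DNS answers for the NATted vessels.
candidates = ["120.216.1.23", "128.59.20.226", "192.168.1.130", "172.31.12.204"]
public, private = split_by_reachability(candidates)
```

With the addresses from the trace, the two NAT forwarders end up in `public` and the two DNS answers end up in `private`. A client that is not on the same LAN as the vessels could skip the `private` list instead of waiting for those connections to time out.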

I don't claim to know whether contacting the same node over two interfaces is behavior that Affix intends; we might also be looking at a different bug. For example, the libraries that seash_helper.py includes/dy_links (time, advertise, nmclient) all use socket_timeout, but only nmclient has its network calls overridden by Affix.

choksi81 commented 10 years ago

After #1407 is patched, this should no longer be an issue. #1404 is also related.