rosetta-home / mdns

basic mdns server and discovery client
Other
46 stars 22 forks source link

Discovery seems to take two or more tries to work #2

Open fhunleth opened 7 years ago

fhunleth commented 7 years ago

This results in a timeout the first time that you try to contact the device running the mdns library. The second or third time seem to be ok.

fhunleth commented 7 years ago

@mobileoverlord @tmecklem Do either you have anything to add? I'm trying to capture the conversation on July 12th regarding this issue since others are seeing it as well.

tmecklem commented 7 years ago

That captures it pretty well. I'm willing to do more investigation or capture logs if needed. I'm working with this (and experiencing it) quite a bit now.

fhunleth commented 7 years ago

Believe it or not, I'm not sure that I'm seeing it on Linux. Either that or I'm getting really lucky.

entone commented 7 years ago

I'm wondering if you guys can capture the IP address of the node when it's not working and when it is working. There's a TTL setting for the IP registration, I'm wondering if there may be a cache issue from previous deployment attempts.

I'm on Linux as well and will start investigating. The "one" test for the mDNS library "should" fail if this was an issue, but that does run locally, so maybe certain routers are at fault.

entone commented 7 years ago

@tmecklem could you run the mix test in you local dev environment?

entone commented 7 years ago

I've updated the tests to be a bit more thorough

tmecklem commented 7 years ago
System Software Overview:

  System Version:   macOS 10.12.5 (16F73)
  Kernel Version:   Darwin 16.6.0
  Boot Volume:  Macintosh HD
  Boot Mode:    Normal
  Computer Name:    xxxxxx
  User Name:    xxxxxx
  Secure Virtual Memory:    Enabled
  System Integrity Protection:  Enabled
  Time since boot:  1 day 5:18

Hardware Overview:

  Model Name:   MacBook Pro
  Model Identifier: MacBookPro11,2
  Processor Name:   Intel Core i7
  Processor Speed:  2 GHz
  Number of Processors: 1
  Total Number of Cores:    4
  L2 Cache (per Core):  256 KB
  L3 Cache: 6 MB
  Memory:   8 GB
  Boot ROM Version: MBP112.0138.B25
  SMC Version (system): 2.18f15
13:01 $ mix test
warning: variable "address" is unused
  test/mdns_test.exs:45

warning: variable "char_host" is unused
  test/mdns_test.exs:45

warning: variable "address" is unused
  test/mdns_test.exs:56

13:01:03.622 [debug] Testing Server

13:01:03.625 [debug] [flags: [:up, :loopback, :running, :multicast], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}, addr: {65152, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

13:01:03.626 [debug] [flags: [:pointtopoint, :multicast]]

13:01:03.626 [debug] [flags: []]

13:01:03.626 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [120, 49, 193, 198, 55, 92], addr: {65152, 0, 0, 0, 7295, 44286, 11481, 20117}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}, addr: {10, 20, 30, 37}, netmask: {255, 255, 255, 0}, broadaddr: {10, 20, 30, 255}]

13:01:03.626 [debug] {65152, 0, 0, 0, 7295, 44286, 11481, 20117}

13:01:03.648 [debug] Address: {65152, 0, 0, 0, 7295, 44286, 11481, 20117}

13:01:03.648 [debug] Hostname: usVTVKQYsw.local

  1) test client and server (MdnsTest)
     test/mdns_test.exs:26
     ** (exit) exited in: GenServer.call(Mdns.Server, {:start, []}, 5000)
         ** (EXIT) an exception was raised:
             ** (MatchError) no match of right hand side value: {:error, :eaddrinuse}
                 (mdns) lib/mdns/server.ex:66: Mdns.Server.handle_call/3
                 (stdlib) gen_server.erl:636: :gen_server.try_handle_call/4
                 (stdlib) gen_server.erl:665: :gen_server.handle_msg/6
                 (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
     stacktrace:
       (elixir) lib/gen_server.ex:737: GenServer.call/3
       test/mdns_test.exs:33: (test)

Finished in 0.09 seconds
1 test, 1 failure

Randomized with seed 556480

13:01:03.668 [error] GenServer Mdns.Server terminating
** (MatchError) no match of right hand side value: {:error, :eaddrinuse}
    (mdns) lib/mdns/server.ex:66: Mdns.Server.handle_call/3
    (stdlib) gen_server.erl:636: :gen_server.try_handle_call/4
    (stdlib) gen_server.erl:665: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message (from #PID<0.166.0>): {:start, []}
State: %Mdns.Server.State{ip: {0, 0, 0, 0}, services: [], udp: nil}
Client #PID<0.166.0> is alive
    (stdlib) gen.erl:169: :gen.do_call/4
    (elixir) lib/gen_server.ex:734: GenServer.call/3
    test/mdns_test.exs:33: MdnsTest."test client and server"/1
    (ex_unit) lib/ex_unit/runner.ex:302: ExUnit.Runner.exec_test/1
    (stdlib) timer.erl:166: :timer.tc/1
    (ex_unit) lib/ex_unit/runner.ex:250: anonymous fn/3 in ExUnit.Runner.spawn_test/3
entone commented 7 years ago

Ok, to run mDNS locally on OSX you need to kill the default mDNS server running locally. In OS X, you can disable it by opening terminal and typing the following:

launchctl unload /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist launchctl unload /System/Library/LaunchDaemons/com.apple.mDNSResponderHelper.plist

entone commented 7 years ago

The test will probably still not work, as the address function is returning your ipv6 address, and I specifically look for the ipv4. Will try to fix.

entone commented 7 years ago

Address function in the test is updated to return ipv4 addresses

aadavids commented 7 years ago

I'm one of the others seeing something similar, except I haven't been able to contact the device at all. This in Linux (Ubuntu 16.10). Here's my mix test output in my dev env:

$ mix test
warning: variable "address" is unused
  test/mdns_test.exs:46

warning: variable "char_host" is unused
  test/mdns_test.exs:46

warning: variable "address" is unused
  test/mdns_test.exs:57

15:40:02.072 [debug] Testing Server

15:40:02.073 [debug] [flags: [:up, :loopback, :running], hwaddr: [0, 0, 0, 0, 0, 0], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}]

15:40:02.073 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [68, 28, 168, 228, 46, 205], addr: {10, 0, 1, 143}, netmask: {255, 255, 255, 0}, broadaddr: {10, 0, 1, 255}, addr: {65152, 0, 0, 0, 60569, 35875, 46618, 58212}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

15:40:02.073 [debug] {10, 0, 1, 143}

15:40:02.084 [debug] Address: {10, 0, 1, 143}

15:40:02.084 [debug] Hostname: dcFIOwMwed.local

15:40:02.089 [debug] {:error, :nxdomain}

  1) test client and server (MdnsTest)
     test/mdns_test.exs:27
     match (=) failed
     code:  {:ok, {:hostent, char_host, [], :inet, 4, [address]}} = lookup
     right: {:error, :nxdomain}
     stacktrace:
       test/mdns_test.exs:46: (test)

Finished in 0.06 seconds
1 test, 1 failure

Randomized with seed 21530
entone commented 7 years ago

Odd that a hostname lookup would fail in .005 seconds. The mDNS server doesn't even log the event on my test for .1 seconds.

15:29:53.363 [debug] Testing Server

15:29:53.364 [debug] [flags: [:up, :loopback, :running], hwaddr: [0, 0, 0, 0, 0, 0], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}]

15:29:53.364 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [128, 250, 91, 28, 72, 194]]

15:29:53.364 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [172, 253, 206, 104, 164, 192], addr: {192, 168, 1, 112}, netmask: {255, 255, 255, 0}, broadaddr: {192, 168, 1, 255}, addr: {65152, 0, 0, 0, 58013, 57918, 36336, 23701}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

15:29:53.364 [debug] {192, 168, 1, 112}

15:29:53.375 [debug] Address: {192, 168, 1, 112}

15:29:53.375 [debug] Hostname: dg5DoV1zN1.local

15:29:53.481 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.482 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.482 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.482 [debug] {:ok, {:hostent, 'dg5DoV1zN1.local', [], :inet, 4, [{192, 168, 1, 112}]}}

15:29:53.482 [debug] Testing Client

15:29:53.482 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.482 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.483 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.483 [debug] Got Response: %DNS.Record{anlist: [%DNS.Resource{bm: [], class: :in, cnt: 0, data: {192, 168, 1, 112}, domain: 'dg5DoV1zN1.local', func: false, tm: :undefined, ttl: 10, type: :a}], arlist: [], header: %DNS.Header{aa: true, id: 0, opcode: :query, pr: false, qr: true, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: []}

15:29:53.483 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: '_nerves._tcp.local', type: :ptr}]}

15:29:53.483 [debug] Got Response: %DNS.Record{anlist: [%DNS.Resource{bm: [], class: :in, cnt: 0, data: {192, 168, 1, 112}, domain: 'dg5DoV1zN1.local', func: false, tm: :undefined, ttl: 10, type: :a}], arlist: [], header: %DNS.Header{aa: true, id: 0, opcode: :query, pr: false, qr: true, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: []}

15:29:53.483 [debug] Got Response: %DNS.Record{anlist: [%DNS.Resource{bm: [], class: :in, cnt: 0, data: '_rosetta._tcp.local', domain: '_nerves._tcp.local', func: false, tm: :undefined, ttl: 10, type: :ptr}], arlist: [], header: %DNS.Header{aa: true, id: 0, opcode: :query, pr: false, qr: true, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: []}
.
15:29:53.483 [debug] Device: {:"_nerves._tcp.local", %Mdns.Client.Device{domain: nil, ip: {192, 168, 1, 112}, payload: %{}, services: ["_rosetta._tcp.local", "_nerves._tcp.local"]}}
entone commented 7 years ago

Can you modify your test to not include the .local in the host name? and report back

entone commented 7 years ago

There's also this bug for 16.10, https://askubuntu.com/questions/837982/how-to-configure-local-dns-lookup-in-ubuntu-16-10#838395

Oh mDNS, the ugly duckling of the network world, so much promise...

aadavids commented 7 years ago

Yes, I was wondering if it was an issue with my environment. Here's the modified test:

$ mix test
warning: variable "address" is unused
  test/mdns_test.exs:46

warning: variable "char_host" is unused
  test/mdns_test.exs:46

warning: variable "address" is unused
  test/mdns_test.exs:57

16:41:33.399 [debug] Testing Server

16:41:33.401 [debug] [flags: [:up, :loopback, :running], hwaddr: [0, 0, 0, 0, 0, 0], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}]

16:41:33.401 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [68, 28, 168, 228, 46, 205], addr: {10, 0, 1, 143}, netmask: {255, 255, 255, 0}, broadaddr: {10, 0, 1, 255}, addr: {65152, 0, 0, 0, 60569, 35875, 46618, 58212}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

16:41:33.401 [debug] {10, 0, 1, 143}

16:41:33.410 [debug] Address: {10, 0, 1, 143}

16:41:33.410 [debug] Hostname: SnzOJR7UJG

16:41:34.344 [debug] {:error, :nxdomain}

  1) test client and server (MdnsTest)
     test/mdns_test.exs:27
     match (=) failed
     code:  {:ok, {:hostent, char_host, [], :inet, 4, [address]}} = lookup
     right: {:error, :nxdomain}
     stacktrace:
       test/mdns_test.exs:46: (test)

Finished in 0.9 seconds
1 test, 1 failure

Randomized with seed 350735
entone commented 7 years ago

Ok, same issue. Are you able to ping any other local machines on your network?

aadavids commented 7 years ago

Update: I tried reinstalling libnss-resolve (sudo apt purge libnss-resolve and sudo apt install libnss-resolve), and now the unmodified tests pass.

aadavids commented 7 years ago
$ mix test                                                                                                                                                      
warning: variable "address" is unused
  test/mdns_test.exs:46

warning: variable "char_host" is unused
  test/mdns_test.exs:46

warning: variable "address" is unused
  test/mdns_test.exs:57

16:47:58.365 [debug] Testing Server

16:47:58.366 [debug] [flags: [:up, :loopback, :running], hwaddr: [0, 0, 0, 0, 0, 0], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}]

16:47:58.367 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [68, 28, 168, 228, 46, 205], addr: {10, 0, 1, 143}, netmask: {255, 255, 255, 0}, broadaddr: {10, 0, 1, 255}, addr: {65152, 0, 0, 0, 60569, 35875, 46618, 58212}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

16:47:58.367 [debug] {10, 0, 1, 143}

16:47:58.376 [debug] Address: {10, 0, 1, 143}

16:47:58.376 [debug] Hostname: QWecd3h2EW.local

16:47:58.497 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'QWecd3h2EW.local', type: :a}]}

16:47:58.505 [debug] {:ok, {:hostent, 'QWecd3h2EW.local', [], :inet, 4, [{10, 0, 1, 143}]}}

16:47:58.505 [debug] Testing Client

16:47:58.506 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: '_nerves._tcp.local', type: :ptr}]}

16:47:58.507 [debug] Got Response: %DNS.Record{anlist: [%DNS.Resource{bm: [], class: :in, cnt: 0, data: '_rosetta._tcp.local', domain: '_nerves._tcp.local', func: false, tm: :undefined, ttl: 10, type: :ptr}], arlist: [], header: %DNS.Header{aa: true, id: 0, opcode: :query, pr: false, qr: true, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: []}
.
16:47:58.507 [debug] Device: {:"_nerves._tcp.local", %Mdns.Client.Device{domain: nil, ip: {10, 0, 1, 143}, payload: %{}, services: ["_rosetta._tcp.local", "_nerves._tcp.local"]}}

Finished in 0.1 seconds
1 test, 0 failures

Randomized with seed 314710
entone commented 7 years ago

excellent, thank you for testing this. hopefully that resolves your network deployment tests as well.

I always run an LTS version of Ubuntu, for the record.

aadavids commented 7 years ago

Yeah, sticking with LTS is probably the smarter thing to do :).

Unfortunately still unable to contact the device.

entone commented 7 years ago

can you run ping nerves.local or whatever the hostname is set to for the device.

are you sure the device is on the network?

Is this rpi0w?

aadavids commented 7 years ago

It's a BeagleBone black, running a port of "gadget mode" from the rpi0. So it's supposed to set up a connection over the USB interface.

I'm unable to ping the BBB from linux. Interestingly, I tried the same device with my mac at home, and it worked. The first ping packet always times out, but after that it's good. So the issue on linux might be unrelated.

fhunleth commented 7 years ago

@aadavids Hmm... Could you try a couple things on Linux? First, could you check that the virtual Ethernet interface that shows up on Linux is getting assigned a link local IP address? For example, run ifconfig -a in a terminal and look for a 169.254.0.0/16 address . The next thing to try is to run wireshark and capture all of the traffic on the virtual Ethernet interface. Wireshark labels packets, so you should see some mDNS packets go through. Or, maybe this is another Ubuntu 16.10 bug?

aadavids commented 7 years ago

Sorry, I won't have that much time to test things until Monday, but here's the network config:

enp0s20f0u1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::3d1b:5fa7:7b82:5256  prefixlen 64  scopeid 0x20<link>
        ether 1a:52:97:d4:b5:03  txqueuelen 1000  (Ethernet)
        RX packets 2  bytes 80 (80.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 45  bytes 6338 (6.3 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

So it looks like it's getting an ipv6 address?

entone commented 7 years ago

Hmm, if the virtual interface is only getting an ipv6 address, that could be an issue. I haven't tested mDNS with ipv6 and I'm pretty sure the rpi0 firmware only uses ipv4 as well.

fhunleth commented 7 years ago

I had thought that IPv4 link local addresses would be assigned by default unless they were explicitly disabled. It may be that the system is waiting for DHCP to time out before assigning an address?

Also, regarding rpi0 firmware only using ipv4, that's sadly correct. For some reason I had thought that it just wasn't used. Not that it will change much since applications need to support it, but I'll start moving it into the systems: https://github.com/nerves-project/nerves_system_rpi0/pull/19.

aadavids commented 7 years ago

I captured some of the traffic on the virtual Ethernet interface and I can see MDNS packets. I also tried to ping the device a couple times while wireshark was running. mdns.pcapng.tar.gz

So then I guess the issue seems to be that the virtual Ethernet interface is getting an ipv6 address?

fhunleth commented 5 years ago

Update: Since we switched away from using IPv4 link-local addressing on USB gadget Ethernet with the official Nerves systems, mDNS has been work very well. If anyone else runs into this and is running Nerves, definitely verify that you're not using link-local addresses on the connection.