thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0

c8y-remote-access-plugin does not resolve .local addresses #2803

Closed reubenmiller closed 3 weeks ago

reubenmiller commented 7 months ago

Describe the bug

The c8y.proxy does not seem to resolve domain names with the .local suffix, although other tools on the same device (ping, curl, wget) do resolve .local host names.

Given the main device, rpi5-d83addab8e9f, the c8y-remote-access-plugin is used to try to access a child device rpizero2-d83add42f121.local.

# Fails
$ c8y-remote-access-plugin 530,rpi5-d83addab8e9f,rpizero2-d83add42f121.local,22,dd4afed9-37cc-4d23-a923-34612bf628e4
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value:   × Failed to connect to TCP socket
  ╰─▶ failed to lookup address information: Name does not resolve
', crates/core/tedge/src/main.rs:50:58
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

However, other Linux tools on the same device (such as ping, curl and wget) do resolve the .local names:

$ ping rpizero2-d83add42f121.local
PING rpizero2-d83add42f121.local (192.168.68.52) 56(84) bytes of data.
64 bytes from rpizero2-d83add42f121 (192.168.68.52): icmp_seq=1 ttl=64 time=11.9 ms
64 bytes from rpizero2-d83add42f121 (192.168.68.52): icmp_seq=2 ttl=64 time=18.3 ms

To Reproduce

This behaviour can be reproduced by calling the c8y-remote-access-plugin command directly. The command is not expected to connect successfully; however, its output should show that the name resolution fails.

Given a main device with thin-edge.io installed on it,

  1. Try to start the c8y remote access plugin from the command line.

    sudo c8y-remote-access-plugin "530,rpi5-d83addab8e9f,<child_device>.local,22,dd4afed9-37cc-4d23-a923-34612bf628e4"

Expected behavior

If a device supports name resolution of .local domains, then thin-edge.io should also support it.

Screenshots

Environment (please complete the following information):

| Property | Value |
| --- | --- |
| OS [incl. version] | Debian GNU/Linux 12 (bookworm) |
| Hardware [incl. revision] | Raspberry Pi 5 Model B Rev 1.0 |
| System-Architecture | Linux rpi5-d83addab8e9f 6.1.0-rpi7-rpi-2712 #1 SMP PREEMPT Debian 1:6.1.63-1+rpt1 (2023-11-24) aarch64 GNU/Linux |
| thin-edge.io version | tedge 1.0.1 |

Additional context

This might be a general problem with the default name resolver used by thin-edge.io, so I wouldn't be surprised if other components also have the same problem.

reubenmiller commented 7 months ago

After some investigation it seems that the musl builds don't support the Name Service Switch (NSS).

This can be demonstrated easily using the tedge mqtt sub CLI command.

Set mqtt.client.host to an mDNS name which can be resolved in your network:

sudo tedge config set mqtt.client.host rpi5-d83addab8e9f.local

Then, using the musl target (e.g. aarch64-unknown-linux-musl), subscribe to any topic:

# ldd $(which tedge)
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
    not a dynamic executable
# tedge mqtt sub 'dummy'
ERROR: I/O: failed to lookup address information: Name does not resolve
ERROR: I/O: failed to lookup address information: Name does not resolve
^C

Then build a glibc variant, e.g. aarch64-unknown-linux-gnu, and do the same (this should work as glibc uses NSS):

# ldd ./tedge
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
    linux-vdso.so.1 (0x00007fff18cf0000)
    libm.so.6 => /lib/aarch64-linux-gnu/libm.so.6 (0x00007fff180d0000)
    libpthread.so.0 => /lib/aarch64-linux-gnu/libpthread.so.0 (0x00007fff180a0000)
    libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x00007fff17ee0000)
    libdl.so.2 => /lib/aarch64-linux-gnu/libdl.so.2 (0x00007fff17eb0000)
    /lib/ld-linux-aarch64.so.1 (0x00007fff18cb8000)
# ./tedge mqtt sub 'dummy'
INFO: Connected

reubenmiller commented 7 months ago

In general, the lack of NSS support affects not just name resolution but also user/group lookups. I suspect the usage of a musl build is also the root cause of this ticket: https://github.com/thin-edge/thin-edge.io/issues/2042
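To make the NSS point concrete, here is a minimal sketch that prints which lookup sources a glibc-linked program would consult for host names. The helper name `nss_hosts_sources` is made up for illustration and is not part of thin-edge.io. Entries such as `mdns4_minimal` (provided by nss-mdns, if it is installed) are only honoured by programs that resolve through glibc, so a statically linked musl binary never consults them.

```shell
#!/bin/sh
# Print the lookup sources listed on the "hosts:" line of an nsswitch.conf file.
# nss_hosts_sources is a hypothetical helper name used only for this sketch.
nss_hosts_sources() {
    conf="${1:-/etc/nsswitch.conf}"
    # Find the first "hosts:" line, drop the key and print the remaining fields.
    awk '$1 == "hosts:" { $1 = ""; sub(/^ +/, ""); print; exit }' "$conf"
}

# Example usage (output depends on the system):
#   nss_hosts_sources
#   getent hosts rpizero2-d83add42f121.local   # getent resolves through NSS
```

If `getent hosts` resolves a .local name while a musl build of tedge does not, that points at the missing NSS support rather than at the network.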

reubenmiller commented 6 months ago

Using another resolver such as hickory-dns (formerly trust-dns) might also improve the reliability of name resolution in the case of misconfiguration (as seen in this ticket: https://github.com/thin-edge/thin-edge.io/issues/2321).

Bravo555 commented 6 months ago

I've got the following working with mDNS:

root@rpi4-d83add9826d7:~/musl-mdns-test# tedge config set mqtt.client.host ubuntu-20.local
root@rpi4-d83add9826d7:~/musl-mdns-test# tedge mqtt sub "#"
INFO: Connected

I found that on my Raspberry Pi 4 with a rugpi image, mDNS lookup indeed doesn't work with musl builds of Rust binaries. But mDNS lookup worked with musl builds on my x86 dev machine, even though musl shouldn't support it!

On closer inspection, the contents of the /etc/resolv.conf files were different between the two machines.

127.0.0.53 is a stub resolver provided by systemd-resolved and is meant to be used by applications that don't use the DBus API for resolving, or don't go through glibc for resolving, but just do a plain DNS lookup.

So systemd-resolved will resolve mDNS hosts, and provide appropriate entries in the stub DNS resolver, which can then be used by musl binaries.
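A quick way to check for this on a given device is to see whether /etc/resolv.conf lists the stub address. The sketch below assumes the stub listens on 127.0.0.53, as described above; the helper name `uses_resolved_stub` is hypothetical.

```shell
#!/bin/sh
# Report (via exit status) whether a resolv.conf-style file points at the
# systemd-resolved stub resolver.
# uses_resolved_stub is a hypothetical helper name used only for this sketch.
uses_resolved_stub() {
    conf="${1:-/etc/resolv.conf}"
    # Match a "nameserver" line whose address is exactly 127.0.0.53.
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+127\.0\.0\.53([[:space:]]|$)' "$conf"
}

# Example usage:
#   if uses_resolved_stub; then echo "stub resolver in use"; else echo "no stub resolver"; fi
```

A musl binary sends plain DNS queries to the nameservers in this file, so when the stub is listed, .local lookups are answered by systemd-resolved, provided mDNS is enabled there.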

To use systemd-resolved, I had to disable NetworkManager and replace it with systemd-networkd. But apparently systemd-resolved can work with NetworkManager, so I will try that next so we don't have to replace NetworkManager altogether.

zhong-ys commented 4 months ago

Experienced the same issue when requesting logs and config files.

Hardware: Raspberry Pi 3
OS: Debian GNU/Linux 12 (bookworm)
tedge version: tedge 1.1.2~71+g0d49756
C8Y instance: Edge on k8s version 10.18

Temporary solution: edited /etc/hosts from 127.0.1.1 raspberrypi to 127.0.1.1 raspberrypi raspberrypi.local

reubenmiller commented 4 months ago

@zhong-ys The recommended way to solve this problem is to use systemd-resolved which can provide the mdns-sd name resolution.

Adding a manual entry for .local isn't recommended, as it bypasses the mDNS functionality. It is also impractical when communicating with other devices in the local network, as you'd need one entry per host.
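As a rough sketch of what enabling mDNS in systemd-resolved can look like, the helper below writes a drop-in containing the `MulticastDNS=yes` setting. The function name `write_mdns_dropin` and the file name `10-mdns.conf` are arbitrary choices for this example, and a distribution image may configure this differently.

```shell
#!/bin/sh
# Write a systemd-resolved drop-in that turns on mDNS resolution.
# write_mdns_dropin and 10-mdns.conf are hypothetical names for this sketch.
write_mdns_dropin() {
    dir="${1:-/etc/systemd/resolved.conf.d}"
    mkdir -p "$dir"
    printf '[Resolve]\nMulticastDNS=yes\n' > "$dir/10-mdns.conf"
}

# Example usage (as root), followed by a restart of the service:
#   write_mdns_dropin
#   systemctl restart systemd-resolved
# mDNS also has to be enabled on the network link itself, for example:
#   resolvectl mdns <interface> yes
```

For musl binaries to benefit, /etc/resolv.conf also has to point at the stub resolver (127.0.0.53).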

reubenmiller commented 3 weeks ago

To show an example of how to install and configure systemd-resolved in a real-world scenario, we're extending the rugpi image to include a new systemd-resolved recipe. It provides a DNS stub which allows other applications (not only thin-edge.io) to resolve hostnames in the .local domain.

The recipe has been added to the upcoming v0.8 image and we'll cut a release once thin-edge.io 1.3.1 is released (due this week).

But since we won't be supporting mdns in the binary itself, I'll consider this ticket closed as "won't do".