Open compiaffe opened 8 months ago
Any insights here?
I wonder if trying v15.2.0 will make a difference. For reference, here is a gist I use occasionally to preload a different Supervisor version into an OS. It may be useful to you as I'm not sure if Supervisor upgrades are available in openBalena.
We are still seeing this issue.
We have tried v15.2.0 (and later versions), with its mDNS fixes, but we are getting the same result.
The last supervisor version that correctly resolves DNS queries is v14.0.8. Anything after that (starting at v14.0.13) does not resolve DNS queries.
Looking at the diffs between v14.0.8 and v14.0.13, the only change that seems relevant is the removal of avahi-daemon (and its configs) from the supervisor container image, but I don't see how that could be causing the issue.
Any help? Thanks in advance
Could you please try the latest v16.3.6 version? This was probably fixed in https://github.com/balena-os/balena-supervisor/pull/2311/commits/6f02b17968d02c2e27b523e40a25ef4c4815d20a.
Thanks for the tip @alexgg, we will try it and report back.
Even in the latest version we see the same issue inside the container:
INFO: Found device /dev/mmcblk0p1 on current boot device mmcblk0, using as mount for '(resin|balena)-boot'.
INFO: Found device /dev/mmcblk0p5 on current boot device mmcblk0, using as mount for '(resin|balena)-state'.
INFO: Found device /dev/mmcblk0p6 on current boot device mmcblk0, using as mount for '(resin|balena)-data'.
[info] Supervisor v16.3.5 starting up...
[info] Setting host to discoverable
[debug] Starting systemd unit: avahi-daemon.service
[debug] Starting systemd unit: avahi-daemon.socket
[debug] Starting logging infrastructure
[info] Starting firewall
[warn] Invalid firewall mode: . Reverting to state: off
[info] Applying firewall mode: off
[success] Firewall mode applied
[debug] Starting api binder
[debug] Performing database cleanup for container log timestamps
[info] Previous engine snapshot was not stored. Skipping cleanup.
[debug] Handling of local mode switch is completed
(node:1) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use node --trace-deprecation ... to show where the warning was created)
[info] API Binder bound to: https://api.aivero.lan/v6/
[event] Event: Supervisor start {}
[info] Starting API server
[info] Supervisor API successfully started on port 48484
[debug] Ensuring device is provisioned
[debug] Connectivity check enabled: true
[debug] Starting periodic check for IP addresses
[info] Waiting for connectivity...
[event] Event: Device bootstrap {}
[info] VPN connection is not active.
[info] New device detected. Provisioning...
[success] Initialised splash image backend
[info] Reporting initial state, supervisor version and API info
[info] Attempting to load any preloaded applications
[event] Event: Device bootstrap failed, retrying {"delay":30000,"error":{"message":"getaddrinfo EAI_AGAIN api.aivero.lan","stack":"Error: getaddrinfo EAI_AGAIN api.aivero.lan\n at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:118:26)"}}
[event] Event: Device bootstrap {}
[info] New device detected. Provisioning...
[event] Event: Device bootstrap failed, retrying {"delay":30000,"error":{"message":"getaddrinfo EAI_AGAIN api.aivero.lan","stack":"Error: getaddrinfo EAI_AGAIN api.aivero.lan\n at GetAddrInfoReqWrap.onlookupall [as oncomplete] (node:dns:118:26)"}}
But balenaOS itself resolves api.aivero.lan just fine.
@alexgg any idea how to further debug this?
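One way to narrow this down is to run the same lookup from the host OS and from inside a container and compare the results. The following is only a sketch, assuming Python 3 is available in a debug container; `check_resolution` is a hypothetical helper name, not part of balena tooling. It uses the system resolver through `socket.getaddrinfo`, which is the same libc path that produces the EAI_AGAIN error in the log above.

```python
import socket


def check_resolution(hostname, port=443, resolver=socket.getaddrinfo):
    """Try to resolve hostname; return (ok, detail).

    `resolver` is injectable only so the function can be exercised
    without a network; by default it is the system resolver.
    """
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as err:
        # EAI_AGAIN means a temporary failure: the resolver did not get a
        # usable answer from any nameserver it tried.
        if err.errno == socket.EAI_AGAIN:
            kind = "temporary failure (EAI_AGAIN)"
        else:
            kind = "resolver error"
        return False, f"{kind}: {err}"
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the resolved address.
    addresses = sorted({info[4][0] for info in results})
    return True, ", ".join(addresses)
```

Calling `check_resolution("api.aivero.lan")` in both environments should show whether the failure is specific to the container's resolver setup.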
Hey @compiaffe, please bring the issue through our support channels, such as the forums, and provide us with a reproduction.
@alexgg
We already have: https://forums.balena.io/t/supervisor-fails-to-resolve-dns-on-v4-v5-in-offline-air-gapped-setup-using-open-balena/369796
Reproduction is outstanding.
We deploy open-balena to an air-gapped network where the router resolves all the required balena domains: e.g. api.aivero.lan and advertises that DNS server via DHCP.
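Since everything here depends on which DNS server each environment actually queries, it may help to compare the nameserver entries seen on the host with those seen inside the supervisor container. A minimal sketch, assuming Python 3 is available and that both environments expose a standard resolv.conf file; the function names are hypothetical:

```python
from pathlib import Path


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # resolv.conf comment lines start with '#' or ';'.
        if not stripped or stripped.startswith(("#", ";")):
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def read_nameservers(path="/etc/resolv.conf"):
    """Read and parse the resolver configuration at `path`."""
    return parse_nameservers(Path(path).read_text())
```

If the list inside the container does not lead to the DHCP-advertised server, that would point at the container's resolver configuration rather than at the network.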
We run balena os configure on RaspberryPi3 images of balenaOS v2.80.3, and these connect nicely even in an air-gapped network. However, these old images don’t have the fixed/updated HQ camera sensor-mode 5, so we need a newer version.
The newest versions, v5.0.8 and v2.115.18+rev2, do not connect to openBalena. The supervisor cannot resolve the domain, although the host OS can.
The supervisor fails with getaddrinfo EAI_AGAIN api.aivero.lan.
We also tried adding a dnsServers: "null" entry to config.json to disable the automatic injection of 8.8.8.8 into the list of DNS servers. In certain cases having 8.8.8.8 caused a timeout waiting on a response from this server which is not reachable due to our air-gapped network. However, this had no effect here.
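For reference, the config.json change described above can be sketched as follows. Note that the value is the string "null", as written above, not a JSON null. The helper names are hypothetical, and the path to config.json is left as a parameter because it depends on where the boot partition is mounted.

```python
import json
from pathlib import Path


def disable_default_dns(config):
    """Return a copy of the config with dnsServers set to the string "null"."""
    updated = dict(config)
    # The string "null" (not JSON null) is the value described in this thread
    # for disabling the automatic 8.8.8.8 entry.
    updated["dnsServers"] = "null"
    return updated


def rewrite_config(path):
    """Load config.json at `path`, apply the change, and write it back."""
    config_path = Path(path)
    config = json.loads(config_path.read_text())
    updated = disable_default_dns(config)
    config_path.write_text(json.dumps(updated, indent=2) + "\n")
```

As noted above, this did not change the behaviour in our case; it only rules out the unreachable 8.8.8.8 entry as the cause.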
We found that the latest balenaOS version for the RaspberryPi3 that has the HQ camera fix AND connects correctly is v2.94.4. For the RaspberryPi4 we are using v2.88.4+rev0, which has the HQ fix AND connects correctly.
There might be a connection to https://github.com/balena-os/balena-supervisor/issues/1335
How do we get the v5 version of balenaOS connecting correctly?
FYI, also posted here: https://forums.balena.io/t/supervisor-fails-to-resolve-dns-on-v4-v5-in-offline-air-gapped-setup-using-open-balena/369796