vlan br network on unraid
Closed · ShredderCody closed 5 months ago
Just curious, as I'm also on unraid and this caught my attention. Are you using macvlan or ipvlan? Run docker network ls to check.
Are you using macvlan or ipvlan?
I'm on ipvlan.
root@ServerOfThings:~# docker network ls
NETWORK ID     NAME     DRIVER   SCOPE
3d56a2f76549   br0      ipvlan   local
738c523b350e   br0.33   ipvlan   local
614299d3100f   br0.66   ipvlan   local
141d492bac7e   bridge   bridge   local
3f38d38cd9b7   host     host     local
4395f5a5a799   none     null     local
root@ServerOfThings:~#
For reference the node is running on the br0.66 vlan.
Interesting. I'm running macvlan and have most of my service docker containers running on their own IP.
There's part of the startup script I had to include that kills anything running on port 3000, because node wouldn't start otherwise; but with it being an empty OS, I couldn't figure out what was running to begin with (probably where that supervisor error is coming from). It SOUNDS like I can just include user=root in the config and it might work for you.
I'll need to set up a dev unraid to do this against I think though. I'm too afraid of breaking my internal system messing with the networking too much ;) is this a blocker for you @ShredderCody ? Can you still use the app or is it busted in your config?
is this a blocker for you @ShredderCody ? Can you still use the app or is it busted in your config?
It seems like just the web UI fails; the monero node seems to work just fine as far as I can tell. I'll help troubleshoot the best I can on my setup.
It SOUNDS like I can just include
user=root
in the config and it might work for you.
I added user=root to the bitmonero.conf in /mnt/user/appdata/monero-nodeboard and restarted the container. This causes the same error, as well as the node now failing to start. My assumption is that the failure to start has to do with the "Unrecognized option 'user' in config file." error. I've attached the full log as well as my bitmonero.conf.
Full log:
/usr/lib/python3/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
self.warnings.warn(
> monero-dashboard@1.1.0 start
> npm install --production && node server/index.js
2024-04-13 03:18:42,424 INFO exited: app (exit status 1; not expected)
2024-04-13 03:18:42,426 INFO spawned: 'app' with pid 32
2024-04-13 03:18:43,427 INFO success: app entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
npm ERR! code EAI_AGAIN
npm ERR! syscall getaddrinfo
npm ERR! errno EAI_AGAIN
npm ERR! request to https://registry.npmjs.org/kill-port failed, reason: getaddrinfo EAI_AGAIN registry.npmjs.org
npm ERR! A complete log of this run can be found in:
npm ERR! /root/.npm/_logs/2024-04-13T03_18_42_704Z-debug-0.log
---------------------------------true---------------------------
-----------------------------------rpc-restricted-bind-port 18089 --rpc-restricted-bind-ip 0.0.0.0 --public-node---------------------------
Unrecognized option 'user' in config file.
> monero-dashboard@1.1.0 start
> npm install --production && node server/index.js
2024-04-13 03:20:17,644 INFO exited: app (exit status 1; not expected)
bitmonero.conf:
data-dir=/app/blockchain
rpc-use-ipv6=true
rpc-bind-ip=0.0.0.0
rpc-bind-port=18081
rpc-bind-ipv6-address=::0
confirm-external-bind=true
enable-dns-blocklist=true
user=root
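As an aside, user= is not a monerod option (hence the "Unrecognized option 'user' in config file." error in the log), but it is a valid option in a supervisord [program:x] section. If the intent is to run the dashboard process as root, it would belong in the container's supervisord config rather than bitmonero.conf. A sketch only — the program name, command, and paths below are guesses based on the log output, not the image's actual config:

```
; supervisord.conf (hypothetical location inside the container)
[program:app]
command=npm start        ; guessed from the "monero-dashboard@1.1.0 start" log lines
directory=/app           ; placeholder path
user=root                ; supervisord's user option -- this is where user= is recognized
autorestart=true
```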
@ShredderCody so it looks like this MIGHT be a DNS issue.
The startup command npx -y kill-port 3000 that is supposed to kill node is actually going out to npm and pulling down the kill-port package to run it.
Could you connect to the terminal in the container and try to hit things outside of your network? It should be grabbing something from DHCP, but if you have this in a vlan maybe something there is excluding the container for some reason.
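One quick way to check this from the container's console is with getent, which goes through the same getaddrinfo path that npm's EAI_AGAIN error comes from. A minimal sketch — the default hostname is just the one from the npm log:

```shell
#!/bin/sh
# Distinguish a DNS (name resolution) failure from a general loss of connectivity.
# getent hosts calls getaddrinfo, the same libc call npm reports as EAI_AGAIN.
host="${1:-registry.npmjs.org}"
if getent hosts "$host" >/dev/null 2>&1; then
  echo "DNS OK for $host"
else
  echo "DNS failure for $host (matches the EAI_AGAIN in the npm log)"
fi
```

If this fails but pinging a raw IP like 1.1.1.1 works, the container has connectivity but no working resolver, which would fit the symptoms here.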
Keep in mind I'm basing this solely on the stack trace line FetchError: request to https://registry.npmjs.org/kill-port failed, reason: getaddrinfo EAI_AGAIN registry.npmjs.org --- this whole line seems shady, and it would explain why the node comes up and can get outside connections (because those don't require DNS).
@ShredderCody so it looks like this MIGHT be a DNS issue.
Definitely a DNS issue. The problem was two-fold. On the DHCP server, the VLAN is set to use the DNS servers 1.1.1.1 and 1.0.0.1; however, because I use ipvlan (still trying to figure out macvlan), the containers don't use DHCP to learn which DNS server to use. They default to the DNS set on Unraid, which in my case is a local DNS server that containers in that VLAN cannot reach. The solution was to manually set the DNS with --dns="1.1.1.1", or any DNS server the container can actually reach. Thanks for the help!
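For anyone landing here: on Unraid that flag goes in the container template's "Extra Parameters" field; with plain docker run it would look something like this (the container name and network are taken from this thread, the image name is a placeholder):

```
docker run -d --name monero-nodeboard \
  --network br0.66 \
  --dns 1.1.1.1 --dns 1.0.0.1 \
  your/image:tag
```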
@ShredderCody Just FYI, unraid is not supporting macvlan right now if you also have a bridge on your interface (i.e. br0, br1). There is a bug, and most are getting crashes with "call trace" errors. So on Unraid the only option is ipvlan right now. It works fine usually, but some networking kit may not like multiple IPs sharing the same MAC address, so consideration does need to be made for that.
If you really need macvlan (most probably don't), workarounds include not running a bridge, or running a VM with docker.
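For reference, the VLAN-tagged networks in the docker network ls output earlier correspond to something like the command below when created by hand (Unraid generates these from its network settings automatically; the subnet and gateway are placeholders you would adjust to your VLAN):

```
docker network create -d ipvlan \
  -o parent=br0.66 \
  --subnet 192.168.66.0/24 \
  --gateway 192.168.66.1 \
  br0.66
```

Swapping -d ipvlan for -d macvlan is the only change needed to create the macvlan equivalent, which is what triggers the call trace bug on Unraid when a bridge is present.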
@samssausages funny thing is I've got macvlan running now, and I'm not seeing the call traces. I tried to switch it over and broke all of my dockers haha. I know just enough about networking to be dangerous, but not to fix this sort of thing.
After an upgrade I did have to rebuild it though, so they definitely don't really want you doing it.
@jnbarlow weird, most of the complaints on the unraid forums are for macvlan combined with a bridge. But it wouldn't surprise me! I used to run macvlan and all was fine, for about two years and through unraid updates. Then they started saying to switch to ipvlan, so I did. After a week I switched back, as I hadn't had any crashes and prefer macvlan. Within a day of switching back I crashed with call trace errors. So now I'm stuck on ipvlan... I thought it was just me, but your story makes me wonder.
I did see limetech (unraid) open a bug report for the underlying linux driver/kernel and they were able to reproduce it, so do hope for a fix eventually.
@samssausages macvlan in my situation is preferable but not required. I did see someone mention that the call traces depend on hardware, but I have no idea how true that is. I also depend on bridging on my servers, so maybe I will be OK to set up macvlan if I disable it.
@samssausages yeah, my config was from early on: bridge, macvlan, no issues... I've upgraded a few times and, similar to @ShredderCody, I tried to switch and broke everything, so I went back. I ran into not being able to completely get rid of the old bridge; it was a mess. I've also not really been able to get IPv6 to work in the containers because of it.
@ShredderCody @jnbarlow Just for reference, here is the official bug report with more info on them recreating the error. https://bugzilla.kernel.org/show_bug.cgi?id=217777
@samssausages @ShredderCody That's crazy. I don't see any of these errors in my syslogs.
My unraid server has a bonded network, so maybe that's the difference? It's a bridge of a bridge?
@jnbarlow I would guess it's probably hardware or something else on the network, but I really don't know. I'm also running a bond and then a bridge on top. Using a Broadcom BCM57416 chipset
I also run a debian VM for docker on a different proxmox server, there I'm using a br0 with macvlan and no issues. (virtio for the vm NIC, Intel X722 on the host)
This seems to happen when using a VLAN br network on unraid; a normal br network works fine. Only access to the dashboard is affected; the node syncs and functions normally. I have ruled out this being a network issue, as I'm able to access all other containers on that VLAN thanks to a firewall rule allowing all of my traffic to that VLAN. Here is where the error takes place in the docker log.
Here is the 2024-04-12T13_56_34_686Z-debug-0.log as well.