jnbarlow / monero-nodeboard

A docker project that combines a monero-node and the monero-dashboard project.
MIT License
6 stars 0 forks

Failed to kill port 3000 results in inability to access Dashboard #7

Closed ShredderCody closed 5 months ago

ShredderCody commented 5 months ago

This seems to happen when using a VLAN br network on Unraid. A normal br network works fine. Only access to the dashboard is affected; the node syncs and functions normally. I've ruled out this being a network issue, as I'm able to access all other containers on that VLAN because of a firewall rule allowing all of my traffic to that VLAN. Here is where the error takes place in the docker log.

/usr/lib/python3/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  self.warnings.warn(
2024-04-12 13:56:33,368 CRIT Supervisor is running as root.  Privileges were not dropped because no user is specified in the config file.  If you intend to run as root, you can set user=root in the config file to avoid this message.
2024-04-12 13:56:33,368 INFO Included extra file "/etc/supervisor/conf.d/supervisords.conf" during parsing
2024-04-12 13:56:33,374 INFO RPC interface 'supervisor' initialized
2024-04-12 13:56:33,374 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2024-04-12 13:56:33,375 INFO supervisord started with pid 1
2024-04-12 13:56:34,378 INFO spawned: 'app' with pid 7
2024-04-12 13:56:35,380 INFO success: app entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
npm ERR! code EAI_AGAIN
npm ERR! syscall getaddrinfo
npm ERR! errno EAI_AGAIN
npm ERR! request to https://registry.npmjs.org/kill-port failed, reason: getaddrinfo EAI_AGAIN registry.npmjs.org

npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2024-04-12T13_56_34_686Z-debug-0.log

Here is the 2024-04-12T13_56_34_686Z-debug-0.log as well.

0 verbose cli [
0 verbose cli   '/usr/bin/node',
0 verbose cli   '/usr/share/nodejs/npm/bin/npm-cli.js',
0 verbose cli   'exec',
0 verbose cli   '--yes',
0 verbose cli   '--',
0 verbose cli   'kill-port',
0 verbose cli   '3000'
0 verbose cli ]
1 info using npm@8.5.1
2 info using node@v12.22.9
3 timing npm:load:whichnode Completed in 0ms
4 timing config:load:defaults Completed in 2ms
5 timing config:load:file:/usr/share/nodejs/npm/npmrc Completed in 4ms
6 timing config:load:builtin Completed in 5ms
7 timing config:load:cli Completed in 6ms
8 timing config:load:env Completed in 1ms
9 timing config:load:file:/app/.npmrc Completed in 0ms
10 timing config:load:project Completed in 2ms
11 timing config:load:file:/root/.npmrc Completed in 0ms
12 timing config:load:user Completed in 0ms
13 timing config:load:file:/etc/npmrc Completed in 0ms
14 timing config:load:global Completed in 1ms
15 timing config:load:validate Completed in 0ms
16 timing config:load:credentials Completed in 1ms
17 timing config:load:setEnvs Completed in 2ms
18 timing config:load Completed in 21ms
19 timing npm:load:configload Completed in 21ms
20 timing npm:load:setTitle Completed in 1ms
21 timing config:load:flatten Completed in 5ms
22 timing npm:load:display Completed in 7ms
23 verbose logfile /root/.npm/_logs/2024-04-12T13_56_34_686Z-debug-0.log
24 timing npm:load:logFile Completed in 9ms
25 timing npm:load:timers Completed in 0ms
26 timing npm:load:configScope Completed in 0ms
27 timing npm:load Completed in 39ms
28 timing command:exec Completed in 94145ms
29 verbose type system
30 verbose stack FetchError: request to https://registry.npmjs.org/kill-port failed, reason: getaddrinfo EAI_AGAIN registry.npmjs.org
30 verbose stack     at ClientRequest.<anonymous> (/usr/share/nodejs/minipass-fetch/lib/index.js:110:14)
30 verbose stack     at ClientRequest.emit (events.js:314:20)
30 verbose stack     at TLSSocket.socketErrorListener (_http_client.js:427:9)
30 verbose stack     at TLSSocket.emit (events.js:326:22)
30 verbose stack     at emitErrorNT (internal/streams/destroy.js:92:8)
30 verbose stack     at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)
30 verbose stack     at processTicksAndRejections (internal/process/task_queues.js:84:21)
31 verbose cwd /app
32 verbose Linux 6.1.79-Unraid
33 verbose argv "/usr/bin/node" "/usr/share/nodejs/npm/bin/npm-cli.js" "exec" "--yes" "--" "kill-port" "3000"
34 verbose node v12.22.9
35 verbose npm  v8.5.1
36 error code EAI_AGAIN
37 error syscall getaddrinfo
38 error errno EAI_AGAIN
39 error request to https://registry.npmjs.org/kill-port failed, reason: getaddrinfo EAI_AGAIN registry.npmjs.org
40 verbose exit 1
41 timing npm Completed in 94488ms
42 verbose code 1
43 error A complete log of this run can be found in:
43 error     /root/.npm/_logs/2024-04-12T13_56_34_686Z-debug-0.log
samssausages commented 5 months ago

vlan br network on unraid

Just curious, as I'm also on Unraid and this caught my attention: are you using macvlan or ipvlan? Do a docker network ls to check.

ShredderCody commented 5 months ago

Are you using macvlan or ipvlan?

I'm on ipvlan.

root@ServerOfThings:~# docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
3d56a2f76549   br0       ipvlan    local
738c523b350e   br0.33    ipvlan    local
614299d3100f   br0.66    ipvlan    local
141d492bac7e   bridge    bridge    local
3f38d38cd9b7   host      host      local
4395f5a5a799   none      null      local
root@ServerOfThings:~# 

For reference, the node is running on the br0.66 VLAN.

jnbarlow commented 5 months ago

Interesting. I'm running macvlan and have most of my service docker containers running on their own IP.

There's a part of the startup script I had to include that kills anything running on port 3000, because node wouldn't start otherwise; with it being an empty OS, I couldn't figure out what was running to begin with (probably where that supervisor error is coming from). It SOUNDS like I can just include user=root in the config and it might work for you.

I'll need to set up a dev Unraid to test this against, though. I'm too afraid of breaking my internal system by messing with the networking too much ;) Is this a blocker for you @ShredderCody? Can you still use the app, or is it busted in your config?
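For context on the workaround above: npx -y kill-port 3000 pulls the kill-port package from the npm registry at startup, which is exactly the step that fails without working DNS. A minimal offline sketch of the same idea, assuming psmisc's fuser is available in the image (an assumption about the base image, not something the repo confirms):

```shell
#!/bin/sh
# Hypothetical offline replacement for `npx -y kill-port 3000`:
# fuser (from psmisc) kills whatever is listening on the port locally,
# with no registry download and therefore no DNS dependency.
# `|| true` keeps startup going when nothing holds the port
# (or when fuser itself is missing).
fuser -k 3000/tcp 2>/dev/null || true
echo "port 3000 cleared"
```

This removes the network round-trip from the container's startup path entirely.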

ShredderCody commented 5 months ago

is this a blocker for you @ShredderCody ? Can you still use the app or is it busted in your config?

It seems like just the webui fails; the monero node seems to work just fine as far as I can tell. I'll help troubleshoot the best I can on my setup.

It SOUNDS like I can just include user=root in the config and it might work for you.

I added user=root to the bitmonero.conf in /mnt/user/appdata/monero-nodeboard and restarted the container. This causes the same error, and now the node also fails to start. My assumption is that the failure to start is due to the "Unrecognized option 'user' in config file." error. I've attached the full log as well as my bitmonero.conf.

Full log:

/usr/lib/python3/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  self.warnings.warn(

> monero-dashboard@1.1.0 start
> npm install --production && node server/index.js

2024-04-13 03:18:42,424 INFO exited: app (exit status 1; not expected)
2024-04-13 03:18:42,426 INFO spawned: 'app' with pid 32
2024-04-13 03:18:43,427 INFO success: app entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
npm ERR! code EAI_AGAIN
npm ERR! syscall getaddrinfo
npm ERR! errno EAI_AGAIN
npm ERR! request to https://registry.npmjs.org/kill-port failed, reason: getaddrinfo EAI_AGAIN registry.npmjs.org

npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2024-04-13T03_18_42_704Z-debug-0.log
---------------------------------true---------------------------
-----------------------------------rpc-restricted-bind-port 18089 --rpc-restricted-bind-ip 0.0.0.0 --public-node---------------------------
Unrecognized option 'user' in config file.

> monero-dashboard@1.1.0 start
> npm install --production && node server/index.js

2024-04-13 03:20:17,644 INFO exited: app (exit status 1; not expected)

bitmonero.conf:

data-dir=/app/blockchain
rpc-use-ipv6=true
rpc-bind-ip=0.0.0.0
rpc-bind-port=18081
rpc-bind-ipv6-address=::0
confirm-external-bind=true
enable-dns-blocklist=true
user=root
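The "Unrecognized option 'user' in config file." error above is expected: user= is a supervisord directive, not a monerod option, so monerod rejects it in bitmonero.conf. A hedged sketch of where the supervisord warning means it to go (the conf path is taken from the startup log above; whether this image actually reads it from there is an assumption):

```ini
; Sketch: user= belongs in supervisord's own config
; (e.g. /etc/supervisor/conf.d/supervisords.conf per the log),
; NOT in bitmonero.conf.
[supervisord]
user=root
```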
jnbarlow commented 5 months ago

@ShredderCody so it looks like this MIGHT be a DNS issue.

The startup command npx -y kill-port 3000 that is supposed to kill node actually goes out to npm and pulls down the kill-port package to run it.

Could you connect to a terminal in the container and try to hit things outside of your network? It should be grabbing something from DHCP, but if you have this on a VLAN, maybe something there is excluding the container for some reason.

Keep in mind I'm basing this solely on stack FetchError: request to https://registry.npmjs.org/kill-port failed, reason: getaddrinfo EAI_AGAIN registry.npmjs.org --- this whole line seems shady, and it would explain why the node comes up and can make outside connections (because that doesn't require DNS).
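A quick way to test the DNS theory from inside the container is to attempt a name lookup directly; a sketch, assuming getent is available in the image (true for glibc-based images, an assumption here):

```shell
#!/bin/sh
# EAI_AGAIN from getaddrinfo is a (usually transient) DNS lookup
# failure. If this lookup fails while connecting to a raw IP works,
# DNS is the culprit rather than general connectivity.
if getent hosts registry.npmjs.org >/dev/null 2>&1; then
    echo "DNS OK"
else
    echo "DNS lookup failed"
fi
```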

ShredderCody commented 5 months ago

@ShredderCody so it looks like this MIGHT be a DNS issue.

Definitely a DNS issue. The problem was two-fold. On the DHCP server, the VLAN is set to use the DNS servers 1.1.1.1 and 1.0.0.1; however, because I use ipvlans (still trying to figure out macvlans), the containers don't use DHCP to get a DNS server. They default to the DNS set on Unraid, which in my case is a local DNS server that containers on that VLAN cannot reach. The solution was to manually set the DNS with --dns="1.1.1.1", or any DNS server the container can actually reach. Thanks for the help!
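The --dns flag fixes this per-container. For reference, Docker's daemon.json supports a global equivalent that all containers inherit unless overridden (standard Docker behavior; how Unraid exposes daemon.json settings is outside this sketch):

```json
{
  "dns": ["1.1.1.1", "1.0.0.1"]
}
```

On a stock Docker host this is typically placed at /etc/docker/daemon.json, followed by a daemon restart.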

samssausages commented 5 months ago

@ShredderCody Just FYI, Unraid is not supporting macvlan right now if you also have a bridge on your interface (e.g. br0, br1). There is a bug, and most people are getting crashes with "call trace" errors. So on Unraid the only option is ipvlan right now. It usually works fine, but some networking kit may not like multiple IPs sharing the same MAC, so consideration does need to be made for that.

If you really need macvlan (most probably don't), workarounds include not running a bridge, or running docker inside a VM.

jnbarlow commented 5 months ago

@samssausages Funny thing is I've got macvlan running now, and I'm not seeing the call traces. I tried to switch it over and broke all of my dockers haha. I know just enough about networking to be dangerous, but not enough to fix this sort of thing.

After an upgrade I did have to rebuild it though, so they definitely don't want you doing it.

samssausages commented 5 months ago

@jnbarlow Weird, most of the complaints on the Unraid forums are for macvlan combined with a bridge. But it wouldn't surprise me! I used to run macvlan and all was fine, for about two years and through Unraid updates. Then they started saying to switch to ipvlan, so I did. After a week I switched back, as I hadn't had any crashes and prefer macvlan. Within a day of switching back I crashed with call trace errors. So now I'm stuck on ipvlan... I thought it was just me, but your story makes me wonder.

I did see limetech (Unraid) open a bug report for the underlying Linux driver/kernel, and they were able to reproduce it, so I do hope for a fix eventually.

ShredderCody commented 5 months ago

@samssausages macvlan in my situation is preferable but not required. I did see someone mention that the call traces depend on hardware, but I have no idea how true that is. I also depend on using bridging on my servers, so maybe I will be OK to set up macvlan if I disable it.

jnbarlow commented 5 months ago

@samssausages Yeah, my config was from early on: bridge, macvlan, no issues... I've upgraded a few times, and similar to @ShredderCody I tried to switch and broke everything, so I went back. I was running into not being able to completely get rid of the old bridge; it was a mess. I've also not really been able to get IPv6 to work in the containers because of it.

samssausages commented 5 months ago

@ShredderCody @jnbarlow Just for reference, here is the official bug report with more info on them recreating the error. https://bugzilla.kernel.org/show_bug.cgi?id=217777

jnbarlow commented 5 months ago

@samssausages @ShredderCody That's crazy. I don't see any of these errors in my syslogs.

My unraid server has a bonded network, so maybe that's the difference? It's a bridge of a bridge?

samssausages commented 5 months ago

@jnbarlow I would guess it's probably hardware or something else on the network, but I really don't know. I'm also running a bond and then a bridge on top, using a Broadcom BCM57416 chipset.

I also run a Debian VM for docker on a different Proxmox server; there I'm using a br0 with macvlan and no issues (virtio for the VM NIC, Intel X722 on the host).