henrygd / beszel

Lightweight server monitoring hub with historical data, docker stats, and alerts.
MIT License
2.85k stars, 89 forks

Cannot get agent to connect to hub (both in same docker-compose). Tried many combinations. #107

Closed (garret closed this issue 3 months ago)

garret commented 3 months ago

I am running around 40 Docker containers on a NixOS x86_64 host and have never had an issue connecting containers to each other (for example, the *arr apps and downloader clients). However, I cannot get the Beszel agent and hub to connect to each other. This is the docker-compose I am using:

    beszel-hub:
        container_name: beszel-hub
        restart: unless-stopped
        image: henrygd/beszel
        environment:
          - TZ=${TIMEZONE}
        ports:
          - 8090:8090
        volumes:
          - ${CONFIG_FOLDER}/beszel:/beszel_data

    beszel-agent:
        container_name: beszel-agent
        restart: unless-stopped
        image: henrygd/beszel-agent
        environment:
          - TZ=${TIMEZONE}
          - PORT=45876
          - KEY="ssh-ed25519 willnotcopyithere"
          - FILESYSTEM=/dev/nvme0n1 # set to the correct filesystem for disk I/O stats
        network_mode: host
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock:ro
        depends_on:
          - beszel-hub

On the host I have opened TCP port 45876 in the firewall and confirmed that I can connect to it via telnet. Both the hub and agent logs seem fine after some days of running:

Hub:

2024/08/08 09:06:29 Server started at http://0.0.0.0:8090/
├─ REST API: http://0.0.0.0:8090/api/
└─ Admin UI: http://0.0.0.0:8090/_/

Agent:

2024/08/08 09:06:29 Found network interface: enp1s0 (755188674016 recv, 910853535670 sent)
2024/08/08 09:06:29 Found network interface: server (1619637328 recv, 4060427020 sent)
2024/08/08 09:06:29 Found network interface: tailscale0 (45475705 recv, 26415497 sent)
2024/08/08 09:06:29 Starting SSH server on :45876

In the hub, I tried adding the system with each of the following hosts: localhost, beszel-agent, 192.168.0.5 (LAN IP address), and 10.0.0.5 (WireGuard IP address), but the agent always shows as down. I also tried adding host.docker.internal:host-gateway to the hub's compose service and then using host.docker.internal as the hostname, as well as disabling network_mode: host and exposing just port 45876. None of these results in the hub connecting to the agent.

In the hub's web interface logs I get the following error:

ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

I also tried deleting the hub's entire config and restarting the container (thus getting a new SSH key).

Do you know what I am doing wrong? I have been really surprised at how difficult it has been for me to try this project. I run most of the popular Docker images and have never had an issue like this.

henrygd commented 3 months ago

I don't think it's a connectivity issue. That error suggests it's an authentication issue with your key.

Please double check that the key in the compose file is the same key that Beszel gives you when you add a new system.

If it is, try changing your compose file to use colon syntax for environment variables like the example file. I suspect this may be the problem.

If that still fails, double check that you have both id_ed25519.pub and id_ed25519 files in your beszel_data directory.

garret commented 3 months ago

I checked, and the public key in the compose file is the same one shown when I try to add a new system. I also have both the private and public keys in the beszel config folder. I changed the compose file to use colon syntax, but now I get:

dial tcp 127.0.0.1:45876: connect: connection refused

henrygd commented 3 months ago

ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

☝️ This error means the agent was found but the key is incorrect, likely because of the formatting of your compose env vars.

dial tcp 127.0.0.1:45876: connect: connection refused

☝️ This error means the hub can't find the agent. If you're seeing this, you need to delete the system in the hub and re-add it using the method you used when you got the first error.

garret commented 3 months ago

I experimented more with the variable syntax, trying both " (as in the docker-compose copied from beszel-hub) and ' (as in the example file). I managed to solve it by using ". The key entry is now:

KEY: "ssh-ed25519 willnotwriteithere"

...and for some reason now it works. Thank you for the support. Will close the issue.
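
For anyone comparing the two syntaxes, here is a minimal sketch of the agent's environment block in map (colon) form, assuming the remaining settings stay as in the original post. In the list form, `- KEY="ssh-ed25519 ..."` is a single plain YAML string, so the double quotes are most likely passed to the agent as part of the value. In the map form, YAML consumes the quotes and only the key text is passed. The key below is a placeholder, not a real key.

```yaml
beszel-agent:
    image: henrygd/beszel-agent
    network_mode: host
    environment:
      TZ: ${TIMEZONE}
      PORT: 45876
      # Placeholder: paste the public key the hub shows when adding a system.
      # The quotes are YAML syntax here and are not part of the value.
      KEY: "ssh-ed25519 willnotwriteithere"
      FILESYSTEM: /dev/nvme0n1 # set to the correct filesystem for disk I/O stats
```

If the hub then reports "connection refused" instead of the handshake error, the advice above still applies: delete the system in the hub and re-add it with the host that produced the first error.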

henrygd commented 3 months ago

No worries, glad you got it working