Closed falkheiland closed 3 months ago
Hi @falkheiland!
That is correct. The missing part for you is setting up each node. Dozzle v7 implemented all the logic to manage services and stacks across nodes. However, the nodes still need to be set up using DOZZLE_REMOTE_HOST. This is not perfect, but I am thinking about it.
Here is an example using socket proxy with swarm that should work:
services:
  dozzle:
    image: amir20/dozzle:latest
    ports:
      - "7575:8080"
    environment:
      DOZZLE_REMOTE_HOST: tcp://<yourfirstdockernodehostnamehere>-doz_proxy:2375,tcp://<yourseconddockernodehostnamehere>-doz_proxy:2375,etc...
    deploy:
      replicas: 1
      update_config:
        delay: 20s
        failure_action: rollback
  proxy:
    image: tecnativa/docker-socket-proxy:0.1.2
    hostname: "{{.Node.Hostname}}-{{.Service.Name}}"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      CONTAINERS: 1
      INFO: 1
    deploy:
      mode: global
      update_config:
        delay: 20s
        failure_action: rollback
Note the use of DOZZLE_REMOTE_HOST as documented at https://dozzle.dev/guide/remote-hosts. Then each swarm node needs to run tecnativa/docker-socket-proxy.
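Since the DOZZLE_REMOTE_HOST list has to be kept in sync with the nodes by hand, a small helper can generate it. This is only a sketch: the function name build_remote_hosts is made up for illustration, and it assumes the proxy service is called "proxy" and uses the hostname template from the example above, so that each proxy is reachable as <node>-<stack>_proxy on port 2375.

```shell
# Build a DOZZLE_REMOTE_HOST value from a stack name and node hostnames.
# Assumption: proxies resolve as <node>-<stack>_proxy, as in the compose
# example above (hostname: "{{.Node.Hostname}}-{{.Service.Name}}").
# $1: stack name, remaining arguments: node hostnames
build_remote_hosts() {
  stack="$1"
  shift
  hosts=""
  for node in "$@"; do
    entry="tcp://${node}-${stack}_proxy:2375"
    if [ -z "$hosts" ]; then
      hosts="$entry"
    else
      # Dozzle expects a comma-separated list of remote hosts.
      hosts="${hosts},${entry}"
    fi
  done
  printf '%s\n' "$hosts"
}
```

On a manager node the hostnames could come from the swarm itself, for example build_remote_hosts doz $(docker node ls --format '{{.Hostname}}'). The result still has to be pasted into the stack file and redeployed whenever nodes change.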
In the future I'd like to leverage mode: global and remove the need for the proxy. However, that is not implemented yet. Currently, Dozzle does the swarm grouping based on what it can see. Other solutions like dockge or portainer have agents. I think that's too much personally. But I am thinking maybe Dozzle can communicate with itself.
One of the biggest reasons I haven't implemented it is that I feel remote hosts are a superset of everything, and they work with swarm too. The only downside is that one would need to manage their own nodes. So if you have a lot of nodes that change often, keeping the list updated would be a pain.
Thanks!
i did the changes and am now facing another problem.
time="2024-06-19T17:51:54+02:00" level=warning msg="Could not connect to remote host tcp:n1-monitoring_dozzle_proxy:2375: error during connect: Get \"http://n1-monitoring_dozzle_proxy:2375/v1.45/containers/json?all=1\": dial tcp: lookup n1-monitoring_dozzle_proxy on 127.0.0.11:53: server misbehaving"
i can ping the hostname from all the dozzle_proxies to all the dozzle_proxies, i also can get the file http://n3-monitoring_dozzle_proxy:2375/v1.45/info (which also appears in the logs) from all the dozzle_proxies.
also a nc on 2375 is responding from dozzle_proxy to dozzle_proxy.
but i cannot test from the dozzle container itself - since i cannot get a console session.
OCI runtime exec failed: exec failed: unable to start container process: exec: "sh": executable file not found in $PATH: unknown
* The terminal process "/bin/bash '-c', 'docker exec -it 5b71b6cdad50f39858b6e096a739d01fe52be024926f1f90c06f2a5a45dbb5bd sh'" failed to launch (exit code: 126).
* Terminal will be reused by tasks, press any key to close it.
is there way to debug from that container itself?
Hmmm. Can you enable debug mode? https://dozzle.dev/guide/debugging
Admittedly, I have never seen the "server misbehaving" error. I will do a little research.
Does curl http://n1-monitoring_dozzle_proxy:2375/v1.45/containers/json work?
I guess this sort of answers my second question. One con of having to use proxy is the confusion of setting it up. I imagine having an agent similar to your first setup would have avoided that. :)
Found some info at https://stackoverflow.com/questions/28332845/docker-network-issue-server-misbehaving which might suggest DNS issues. I have never seen these so I think something is different about your setup.
I forgot to answer your last question
is there way to debug from that container itself?
Not easily. There is nothing installed in the docker container except the Dozzle binary. But you could change the Dockerfile to build from alpine instead and have access to curl and other commands.
However, first I would look at resolving the issues as suggested in the stackoverflow.
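Because the Dozzle image has no shell, one way to test name resolution "from" the Dozzle container without rebuilding it is to start a throwaway container that shares its network namespace. The sketch below uses Docker's --network container:<id> option and the nslookup applet that ships in the alpine image. The function name netns_lookup and the DOCKER override are made up for illustration, and it has to run on the node where the Dozzle task is actually scheduled. I have not verified this on the cluster discussed here.

```shell
# Run a DNS lookup from inside another container's network namespace.
# $1: ID or name of the running Dozzle container on this node
# $2: hostname to resolve, e.g. one of the socket-proxy hostnames
# Setting DOCKER=echo prints the command instead of running it (dry run).
netns_lookup() {
  ${DOCKER:-docker} run --rm --network "container:$1" alpine nslookup "$2"
}
```

For example, netns_lookup 5b71b6cdad50 n1-monitoring_dozzle_proxy would show whether the embedded DNS at 127.0.0.11 answers for that name from Dozzle's point of view.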
Any update? Did it work?
so. after a lot of trial and error - i finally deployed the same thing to a seemingly similarly configured cluster - and here it works just fine.
i will try and figure out what is different between those installations and will post here if i find it out.
@amir20 thank you for your quick response, love the product!
well, i do face the same problem on that initially working cluster as i did with the first one. the name resolution via name spaces seems to cause issues, at least for me. as much as i like the product - i will not be able to use it in this environment.
@falkheiland I had seen some other message from you but they are not here. I am not sure what happened. I'll try to respond but might be losing some context.
if i run the stack only with the internal network (obv no web ui access via traefik then) it works (from the logs). as soon as i add the overlay network in the mix, the dozzle service does not seem to be able to get the name resolution for the dozzle-proxies.
I have noticed that in all your examples the remote host points to a hostname. Have you tried using the actual IP address of the node, e.g. DOZZLE_REMOTE_HOST: tcp://10.0.1.2:2375?
My hunch is that Docker doesn't like the DNS.
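To compare hostnames against IP addresses quickly, the remote host list can be turned into plain HTTP URLs and probed with curl from any container on the same network. This is a rough sketch: the function name remote_host_urls is made up, and the /v1.45/info path is simply the one that already appears earlier in this thread, so adjust it to your Docker API version.

```shell
# Convert a DOZZLE_REMOTE_HOST-style list into probe URLs, one per line.
# $1: comma-separated list such as tcp://10.0.1.2:2375,tcp://10.0.1.3:2375
remote_host_urls() {
  # Split on commas, swap tcp:// for http://, then append the API path.
  printf '%s\n' "$1" | tr ',' '\n' | sed -e 's|^tcp://|http://|' -e 's|$|/v1.45/info|'
}
```

Something like remote_host_urls "tcp://10.0.1.2:2375,tcp://n1-monitoring_dozzle_proxy:2375" | xargs -n1 curl -fsS would then show whether the IP works while the hostname fails, which would point at DNS.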
since i cannot get a console session on the dozzle container to debug, i configured the dozzle-proxy containers to also have both the external and the internal networks running. here the name resolution via the set hostnames works w/o a problem - the dozzle container seems to have that problem only for me
Would it be helpful if I create a PR with alpine?
i will not be able to use it in this environment
It's unfortunate. If you can give me some way to reproduce this in AWS, DigitalOcean or something else, then I can try testing it myself. I use Orbstack, which comes with VM support. I set up 3 VMs and it did seem to work. I haven't tried attaching a proxy network. Maybe that's next.
Finally, I think a lot of these issues are related to using the socket proxy. In the back of my mind, I think creating an agent that would remove the need for socket-proxy might fix it. I have started thinking about it. It seems like a lot of work, but maybe I can do it in my spare time as a fun project. It would require setting up gRPC, mesh, and distributed computing to have all agents talk to each other.
For now, if you are able to reproduce this for me using some kind of compose file and VMs locally then I can try to debug.
Created https://github.com/amir20/dozzle/issues/3052. Feedback welcomed.
since i have been away for the last days: #3052 (agent support) would be the best option imho.
@falkheiland try it out. Instructions are at https://github.com/amir20/dozzle/pull/3058
I still got some work to do but I think for your use case it should be pretty easy.
i am running dozzle in a 3 node docker swarm mode cluster (all nodes are manager nodes) behind traefik. when i open the dozzle webinterface, it only shows the containers on that node (which is selected by traefik). the Swarm Mode slider is not enabled by default. https://dozzle.dev/guide/swarm-mode says:
this seems to be the only requirement for the config.
i also tried using the dev.dozzle.group - label, since i am not sure about the services / no service usage. i am sure i am missing something here... can you maybe provide a docker swarm mode specific docker-compose.yml as reference?
docker-compose.yml:
dozzle service logs: