Dozzle running in multi-node environments like Swarm

githubbiswb commented 1 year ago

Is your feature request related to a problem? Please describe. Using Dozzle in a cluster setup, for example Docker Swarm, without wanting to expose port 2375 or 2376.

Describe the solution you'd like If the Dozzle containers could open another port, or use a subfolder on the existing 8080 to make the data calls they need, each node in the cluster could run a dozzle container, and then your "Master" container would use the DOZZLE_REMOTE_HOST environment variable to call them, keeping all the communication inside the docker network as well. Swarm uses internal DNS, and my guess is other orchestrators do as well, e.g. DOZZLE_REMOTE_HOST = 'dozzle01:8080/gettingthedata, dozzle02:8080/gettingthedata, dozzle03:8080/gettingthedata'

Describe alternatives you've considered What I did is put a reverse proxy in front of all dozzle instances and redirect to the correct container by subfolder, e.g. mydomain.com/dockernode01logs or mydomain.com/dockernode02logs or mydomain.com/dockernode03logs etc.

What I considered was grabbing this container https://github.com/Tecnativa/docker-socket-proxy and having each host run one, then make a docker network they all talk on which keeps the port 2375 behind the scenes, but it needs --privileged and I try to avoid that like the plague

Additional context Love the project, very clean and I am a fair bit out of my depth when it comes to making the things make calls and connect to each other so this might be crazy and feel free to just close if it is, but if you can see the vision, awesome!

One other thing: if you implement this, you might want to add an environment variable that tells dozzle it's okay if it can't connect to its docker socket. In the setup I imagine above, with 5 hosts I would run 6 dozzle containers: 5 that collect the info and a 6th that gathers and displays it but doesn't need to connect to its docker socket, since one of the collectors would have that handled.

amir20 commented 1 year ago

In the beginning, when I was planning to support multi-host setup, I was planning to do exactly this. Something very similar to what you suggested. This was all discussed in https://github.com/amir20/dozzle/issues/1608

Getting a list of all nodes was challenging. I had to roll my own distributed solution.

So after a lot of discussion, I opted for a simpler solution. Note that people using more than one node make up only 0.3% of all users. So it didn't make sense for me to invest a lot of time.

What I did is put a reverse proxy in front of all dozzle instances and redirect to the correct container by subfolder, e.g. mydomain.com/dockernode01logs or mydomain.com/dockernode02logs or mydomain.com/dockernode03logs etc.

This is amazing. Maybe you can share your configuration.

What I considered was grabbing this container https://github.com/Tecnativa/docker-socket-proxy and having each host run one, then make a docker network they all talk on which keeps the port 2375 behind the scenes, but it needs --privileged and I try to avoid that like the plague

I hear you and that's probably why a lot of people don't like the solution.

Love the project, very clean and I am a fair bit out of my depth when it comes to making the things make calls and connect to each other so this might be crazy and feel free to just close if it is, but if you can see the vision, awesome!

Thanks! I think it's clean because I do try to keep it minimally complicated.

One other thing: if you implement this, you might want to add an environment variable that tells dozzle it's okay if it can't connect to its docker socket. In the setup I imagine above, with 5 hosts I would run 6 dozzle containers: 5 that collect the info and a 6th that gathers and displays it but doesn't need to connect to its docker socket, since one of the collectors would have that handled.

I am not sure if I understand this question. But recently, I made it so that docker socket is not a requirement anymore. Is that what you are asking?

I won't close this just yet. Ideally, there is a simpler way to get Dozzle running for swarm. I just can't think of anything except creating my own "workers" across all nodes which is super difficult for a one-man project.

githubbiswb commented 1 year ago

I am not sure if I understand this question. But recently, I made it so that docker socket is not a requirement anymore. Is that what you are asking?

Interesting, well I tried today to run without it and it wouldn't come up, but I also may have had other things not configured correctly

This is amazing. Maybe you can share your configuration.

Where should I post my reverse proxy config and swarm setup? It's rather simple, just repetitive. It's also on the work machine, so I will have to post it tomorrow anyway.

githubbiswb commented 1 year ago

A few notes and samples from my setup. I can't direct copy and paste from work, but I will share as much as I can.

I run an nginx reverse proxy, standard nginx image
I have a domain I use internally, let's call it containers.com
I have 12 hosts that I run docker swarm on, name them dockswarm01 - dockswarm12
I have an internal docker network set up, and both the nginx and dozzle containers must connect to it; let's call it DockInternalComms
I have my reverse proxy and dozzle configs separated because my reverse proxy does a lot more than just dozzle
My dozzle yaml only shows one host, but I repeat that 12 times, changing the service name, the base user variable and the constraint on the host
My subfolder for dockerswarm01 only shows one host, but I repeat that 12 times making 12 different files, changing the file name, the location and the upstream_app variable
I am at dozzle version 4.10.20
I am at nginx:1.25.1-alpine3.17-slim
I used subfolders to my domain, you can also use subdomains

Docker swarm dozzle.yml setup

version: "3.7"
services:
  # One service per node: repeat this block, changing the service name,
  # DOZZLE_BASE, and the hostname constraint.
  dozzle01:
    image: amir20/dozzle:v4.10.20
    networks:
      - DockInternalComms
    volumes:
      - type: bind
        source: /var/run/docker.sock
        target: /var/run/docker.sock
        read_only: true
    environment:
      DOZZLE_BASE: /dockswarm01
    deploy:
      replicas: 1
      update_config:
        delay: 20s
        failure_action: rollback
      placement:
        constraints:
          - node.hostname == dockswarm01
networks:
  DockInternalComms:
    external: true

My nginx yaml:

version: "3.7"
services:
  httpsreverseproxy:
    image: nginx:1.25.1-alpine3.17-slim
    networks:
      - DockInternalComms
    ports:
      - 443:443
    volumes:
      - /home/nginx/html:/usr/share/nginx/html
      - /home/nginx/templates:/etc/nginx/templates
      - /home/nginx/subdomains:/etc/nginx/subdomains
      - /home/nginx/subfolders:/etc/nginx/subfolders
      - /home/nginx/certs:/certs:ro
    environment:
      NGINX_PORT: 443
    deploy:
      replicas: 1
      update_config:
        delay: 20s
        failure_action: rollback

networks:
  DockInternalComms:
    external: true

My default.conf.template for the nginx container, I am only going to show lines we change, lots more can be in there. Also if you know the work of the linuxserver.io team, these two genius lines for subfolders and subdomains come right from them. RM your stack and then deploy your stack for changes in this file to take effect

server {
  include /etc/nginx/subfolders/*.subfolder.conf;
}

include /etc/nginx/subdomains/*.subdomain.conf;

My dockswarm01.subfolder.conf file. Again, shout out to the linuxserver.io guys, who do this with variables instead of a single connection string. That allows your reverse proxy to come up even when services are down; it all connects later once the service is available.

location ^~ /dockerswarm01logs/ {
  proxy_set_header Host $host;
  proxy_set_header X-Real-IP $remote_addr;

  resolver 127.0.0.11 valid=30s ipv6=off;

  set $upstream_app dozzle01;
  set $upstream_port 8080;
  set $upstream_proto http;
  proxy_pass $upstream_proto://$upstream_app:$upstream_port;

  chunked_transfer_encoding off;
  proxy_buffering off;
  proxy_cache off;
}

Now I can go to containers.com/dockerswarm01logs/ and see dozzle on host 01, or containers.com/dockerswarm09logs/ and see dozzle on host 09 etc etc. Also make sure to reload your nginx container with a reload command to the container, or just rm the stack and deploy again like you did above.

As stated above, that all won't work right out of the box, but it should get you a long way to where you need to be. You should end up with as many services in your dozzle yaml file as you have subfolder entries. For subdomain people, that all should work for your config, check out the templates offered by the linuxserver.io team on their swag container, very helpful info there

amir20 commented 1 year ago

Thanks @githubbiswb. The reason I asked for this is to figure out if Dozzle can make this setup quicker in any way.

I think one of the pain points is that you have to create N Dozzle configurations, one for each base, where really the only thing that changes is DOZZLE_BASE: /dockswarm01. I believe with swarm you can pass template variables, as described here. I wonder if it would make sense for this to be BASE=/dockerswarm{{.Task.Slot}}.

I have no easy way of testing this because I would have to setup swarm with a few nodes.

But for now, let me keep this open and see if Dozzle has a way to improve your life. Let me know if you can think of any...

Thanks!

githubbiswb commented 1 year ago

I will test that out tomorrow when I am back on the office computers and let you know

amir20 commented 1 year ago

Sounds good! You do need to run Dozzle on all nodes for Slot to work. Let me know how it works out. But more importantly, let me know if any settings or configurations can be easier from Dozzle.

amir20 commented 1 year ago

@githubbiswb the more I thought about this, I don't think it would work. Dozzle would be deployed as a service and the networking would be across all the instances. So you wouldn't be able to choose an instance.

I don't think there is much Dozzle can do to improve getting it setup across multiple nodes. I would have to find a way to get all instances to communicate which is hard.

githubbiswb commented 1 year ago

I haven't had a chance to test yet, but I am also looking at running this stack as a compose file, which works a little differently than the stack command. I still might be able to do it with node.labels though; exploring that option.

As far as containers talking to each other, all swarm needs is a network setup between the hosts which I use all the time

So there would be a dozzle network, and each dozzle container could reach all other dozzle containers by their DNS name, something docker dns internal to the cluster handles

amir20 commented 1 year ago

So there would be a dozzle network, and each dozzle container could reach all other dozzle containers by their DNS name, something docker dns internal to the cluster handles

The problem with that is discovery. Say there are 3 instances. How does one of the Dozzle instances query the other two? I don't think the DNS names of all instances are passed to Docker. If they were, it would be great. I could then just query all Dozzle instances for their own data.

githubbiswb commented 1 year ago

So the DNS name comes from the service itself

version: '3'

services:
  dozzle01:
.
.
.
  dozzle02:
.
.
.

Then dozzle01 can reach dozzle02 with a shared network. You still have to define each one as service which makes for a big file, but then one could gather all the reporting data into a single dozzle instance and not need the reverse proxy to point to N hosts subfolders

amir20 commented 1 year ago

Oh, with Docker swarm you can tell it to run a container on all nodes.

version: '3'
services:
  dozzle:
    image: ...
    deploy:
      mode: global

Then it gets deployed on each node. I think that would be the right way to do it. Explicitly calling it out for each node feels more work than it needs to.

githubbiswb commented 1 year ago

You are correct about global and I looked at that, but the problem then becomes that the reverse proxy goes to port 8080 and swarm isn't sure which node you mean, so it just picks one at random. (As I recall; it was a few days ago when I gave global a shot.)

Sadly, it doesn't make them DNS reachable with something like dozzle.1, dozzle.2 or anything like that or keep it static to the hosts.

With that said, perhaps the global tag would then set those variables you talked about before that nginx could perhaps use. I have to test to see if that would be the case

amir20 commented 1 year ago

Sadly, it doesn't make them DNS reachable with something like dozzle.1, dozzle.2 or anything like that or keep it static to the hosts

Yep, exactly. That's what I mean with my original comment. I don't think you can target a specific instance.

With that said, perhaps the global tag would then set those variables you talked about before that nginx could perhaps use. I have to test to see if that would be the case

I think in the perfect world, if there was a way for Docker to let me communicate between instances then it would be easy to build. But I don't think it does...

githubbiswb commented 1 year ago

I think in the perfect world, if there was a way for Docker to let me communicate between instances then it would be easy to build. But I don't think it does...

We might be at risk of talking in circles here and I don't want to do that to you, this is a great product even if it can't support swarm or multiple hosts without sharing the docker port over the network, but with that said

The containers can absolutely talk to each other as long as they are joined on the same overlay network. I do this all the time in swarm. It's how my reverse proxy even works: it goes to the dozzle containers, which share a network with it. Port 8080 is not exposed in any way on my setup; it stays under the covers of the docker network. In my original compose swarm config, no ports are listed.

Yet the container still listens on 8080 and the reverse proxy can talk to it.

Dozzle containers could also talk to each other on 8080 as well, or a different port, and scrape what they need, present it all in one GUI. The trick is you would need to know the dozzlecontainer names, and in swarm we provide that with the service.

Dozzle, the GUI container could simply reference the other hosts by that service name in its Extra_hosts field you already use, so you don't have to figure out what they named their services. They include that in the Environment Variable.

So my example above would look like

DOZZLE_REMOTE_HOST: dozzle01:8080, dozzle02:8080, etc.....

So maybe as I think about it, that is the feature I am actually asking for.

Could dozzle use another dozzle container on port 8080 as a remote host input? If so, problem solved. Just not sure if it can scrape that way and how hard it would be to implement
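To make the request concrete, here is a minimal sketch in Go of how a comma-separated list like the one above might be parsed into host:port entries. It assumes the proposed list format from this comment; pointing DOZZLE_REMOTE_HOST at other Dozzle instances is not existing behavior, and `parseRemoteHosts` is a hypothetical name.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// parseRemoteHosts (hypothetical) splits a list such as
// "dozzle01:8080, dozzle02:8080" into validated host:port entries.
// The format is the one proposed in this thread, not a Dozzle option.
func parseRemoteHosts(raw string) ([]string, error) {
	var hosts []string
	for _, part := range strings.Split(raw, ",") {
		entry := strings.TrimSpace(part)
		if entry == "" {
			continue // tolerate trailing commas and stray spaces
		}
		host, port, err := net.SplitHostPort(entry)
		if err != nil {
			return nil, fmt.Errorf("invalid remote host %q: %w", entry, err)
		}
		if host == "" || port == "" {
			return nil, fmt.Errorf("invalid remote host %q: host and port are both required", entry)
		}
		hosts = append(hosts, net.JoinHostPort(host, port))
	}
	return hosts, nil
}

func main() {
	// Service names resolve through swarm's internal DNS on a shared overlay network.
	hosts, err := parseRemoteHosts("dozzle01:8080, dozzle02:8080, dozzle03:8080")
	if err != nil {
		panic(err)
	}
	for _, h := range hosts {
		fmt.Println(h)
	}
}
```

Requiring an explicit port keeps a typo such as a bare `dozzle01` from silently falling back to some default.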

amir20 commented 1 year ago

We might be at risk of talking in circles here and I don't want to do that to you, this is a great product even if it can't support swarm or multiple hosts without sharing the docker port over the network, but with that said

I enjoy these conversations. I actually think we are getting somewhere. I found a few things that I'll try to touch on...

Dozzle containers could also talk to each other on 8080 as well, or a different port, and scrape what they need, present it all in one GUI. The trick is you would need to know the dozzlecontainer names, and in swarm we provide that with the service

DOZZLE_REMOTE_HOST: dozzle01:8080, dozzle02:8080, etc.....

Yep, agreed that with an overlay network it should work. With your example, however, dozzle01 and dozzle02 are two different services. I don't think this is intuitive. Deployers can create one service per node, but that would be costly to maintain. There has to be a better way! And I think there is. Thanks to you, I recently discovered endpoint_mode. Changing endpoint_mode to dnsrr allows each replica to be discovered. This is EXACTLY what I was looking for.

Could dozzle use another dozzle container on port 8080 as a remote host input? If so, problem solved. Just not sure if it can scrape that way and how hard it would be to implement

Not currently implemented. I think the right choice would be minimal setup for swarm. Using endpoint_mode I could get a list of all the replicas and connect to each on :8080. Then it would be very similar to the remote host architecture, except it would use other APIs to fetch the data.

Now to be realistic, I haven't seen many people use Dozzle with Swarm. I wish they had because that would be ultimate use case to break into enterprise. My data shows that only about 1% are using remote hosts.

After learning about your workaround, I think it's actually a pretty good idea. I want to test whether endpoint_mode works, but I probably won't implement it as it is so much work for very few people.

amir20 commented 1 year ago

So this is exciting

I did test with endpoint_mode.

Something like:

version: "3.7"
services:
  dozzle:
    image: amir20/dozzle:localhost
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    deploy:
      replicas: 2
      endpoint_mode: dnsrr

Print DNS in Go produces:

level=info msg="A in 10.0.3.9"
level=info msg="A in 10.0.3.10"

So it does work :)

It's just a matter of implementing it. I wonder if you can use this too, and somehow tell your proxy to reach the service replicas by IP.
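The discovery step could look roughly like the sketch below. It assumes the service is named dozzle, uses endpoint_mode: dnsrr, and listens on 8080; `discoverPeers` and `lookupFunc` are hypothetical names, and this is not how Dozzle is implemented. The lookup is passed in as a function so `net.LookupHost` can be used inside the swarm while a fake makes the sketch runnable anywhere.

```go
package main

import (
	"fmt"
	"net"
)

// lookupFunc has the same signature as net.LookupHost, so the real
// resolver or a fake can be passed in.
type lookupFunc func(host string) ([]string, error)

// discoverPeers (hypothetical) resolves the service name and returns
// "ip:port" for every replica except the one at selfIP.
func discoverPeers(lookup lookupFunc, service, selfIP, port string) ([]string, error) {
	addrs, err := lookup(service)
	if err != nil {
		return nil, fmt.Errorf("resolving %q: %w", service, err)
	}
	var peers []string
	for _, a := range addrs {
		if a == selfIP {
			continue // skip this instance; it reads its own docker socket
		}
		peers = append(peers, net.JoinHostPort(a, port))
	}
	return peers, nil
}

func main() {
	// Inside the swarm, net.LookupHost would be passed instead of this
	// fake, which returns the two A records from the test above.
	fake := func(string) ([]string, error) {
		return []string{"10.0.3.9", "10.0.3.10"}, nil
	}
	peers, err := discoverPeers(fake, "dozzle", "10.0.3.9", "8080")
	if err != nil {
		panic(err)
	}
	fmt.Println(peers)
}
```

Because replica IPs change on redeploy, a real implementation would likely need to repeat the lookup periodically rather than cache the result.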

githubbiswb commented 1 year ago

yeah, that does look interesting. I keep trying to find time to work on this and they keep pulling me to other stuff, I may just dev this in my lab tonight because this looks good for my own use too

Thank you for all of your hard work on this!

githubbiswb commented 1 year ago

Okay, did some digging as well with dnsrr. You can also use that in global mode, which really cuts down on the needed compose file. From another container I did an nslookup on the service name of just dozzle, and sure enough it returned all of the endpoint IPs. So as long as the service name was mentioned as the extra host, and the DNS response told the master dozzle it needed to grab lots of hosts, not just one, I could see a pretty simple docker swarm setup using dozzle for logs. With that said, I also get it: putting in a lot of work to get it working for a small minority doesn't seem to have the payoff it needs.

With my reverse proxy I think it would still round robin aka almost at random pick which IP it would proxy for. Unless nginx had a way to show a selector and you would pick which you wanted. But since the IPs change all the time, and there isn't a host that gets the same IP, the selector wouldn't be that helpful. Perhaps nginx could display all of the pages though on a single page or something like that, feeling like iframes?

amir20 commented 1 year ago

And with another container did an nslookup on the service name of just dozzle, and sure enough it returned all of the endpoints IPs.

I wonder if Dozzle could just look up itself as a service.

All that said, with Docker Swarm there is also the notion of service logs, which can be viewed with docker service logs .... Ideally, Dozzle wouldn't have to worry about nodes at all and would just use the API. However, in my experience Docker swarm was a little buggy, and even the logs wouldn't load sometimes.

Doing all this DNS stuff is just a hack around the real problem.

I am going to close this issue for now. Thanks for the conversation. Good to know we have options if I wanted to implement something myself.

Perhaps nginx could display all of the pages though on a single page or something like that, feeling like iframes?

Yea, but I think you would have to do that yourself. It probably wouldn't be too hard to just throw together a proxy, written in any language, with a drop-down.

amir20 / dozzle

Dozzle running in multi-node environments like Swarm #2324