grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Loki driver prevents docker containers from starting #10415

Open Youpiiiii opened 1 year ago

Youpiiiii commented 1 year ago

Describe the bug

I use rootless Docker in a Proxmox VM, and the Loki driver prevents containers from starting when I reboot the VM.

To Reproduce

  1. Install docker rootless in Proxmox VM (Debian)
  2. Install Loki driver plugin
  3. Use a basic config:

     cat .config/docker/daemon.json
     {
       "dns-search": ["ad.domain.com"],
       "debug": true,
       "log-driver": "loki",
       "log-opts": {
         "loki-url": "https://user:password@loki.ad.domain.com:3100/loki/api/v1/push",
         "loki-batch-size": "400"
       }
     }
  4. Reboot VM
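
Before rebooting, it can help to rule out a malformed daemon.json. The following is a minimal sketch, not part of the original report: the function name `check_daemon_config` is hypothetical, and the location under the home directory is an assumption based on the relative path shown in step 3. It checks that the file parses, that the log driver is `loki`, that `loki-url` is present, and that every `log-opts` value is a string, since Docker expects `log-opts` values in daemon.json to be quoted.

```python
import json
from pathlib import Path


def check_daemon_config(text):
    """Return a list of problems found in a daemon.json document (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    if config.get("log-driver") != "loki":
        problems.append("log-driver is not 'loki'")

    log_opts = config.get("log-opts", {})
    if not isinstance(log_opts, dict):
        return problems + ["log-opts must be a JSON object"]
    if "loki-url" not in log_opts:
        problems.append("log-opts.loki-url is missing")
    # Numbers and booleans in log-opts must be written as quoted strings.
    for key, value in log_opts.items():
        if not isinstance(value, str):
            problems.append(
                f"log-opts.{key} must be a string, got {type(value).__name__}"
            )
    return problems


if __name__ == "__main__":
    # Assumed location for a rootless setup, taken from the path in step 3.
    path = Path.home() / ".config" / "docker" / "daemon.json"
    if path.exists():
        issues = check_daemon_config(path.read_text())
        print("\n".join(issues) if issues else "daemon.json looks consistent")
    else:
        print(f"{path} not found")
```

The config in step 3 passes these checks, so the sketch is only useful for excluding a config typo as the cause.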

Environment:

Errors

Syslog error:

dockerd-rootless.sh[607]: time="2023-08-31T23:17:57.039336754+02:00" level=error msg="1fca447249aae1b18520639e213e64df9d226d69c02d4f910d7a43bcfbc2aaaf cleanup: failed to delete container from containerd: container \"1fca447249aae1b18520639e213e64df9d226d69c02d4f910d7a43bcfbc2aaaf\" in namespace \"moby\": not found"

When I try to restart a container:

Error response from daemon: Cannot restart container haproxy: driver failed programming external connectivity on endpoint haproxy (1fca447249aae1b18520639e213e64df9d226d69c02d4f910d7a43bcfbc2aaaf): Bind for 0.0.0.0:8404 failed: port is already allocated

I have to restart Docker with: systemctl --user restart docker

Sheikh-Abubaker commented 1 year ago

Hey there @Youpiiiii, I am a beginner in open source contributions and don't know that much yet, but for the sake of learning and contributing like a pro, I went through the issue. As I understand it, you are trying to start a Docker container named "haproxy", and the error you are encountering is due to the fact that the desired port, 8404, is already allocated. Am I right?

If I am right, I have some solutions in mind:

  1. We can identify the process that has occupied port 8404 and then just stop it using the appropriate commands.
  2. Or maybe we can modify the docker configuration to use some other port.
  3. If this issue is arising due to an existing instance of "haproxy" container, one can simply remove it and start a new one.
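
For the first option, a quick way to confirm that the port is really held is to try binding it. This is only a minimal sketch: the helper name `port_is_free` is hypothetical, and 8404 is taken from the error message above. It tells you whether the port can be bound, not which process holds it.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: something else already holds the port.
            return False
    return True


if __name__ == "__main__":
    port = 8404  # port from the "port is already allocated" error
    state = "free" if port_is_free(port) else "already in use"
    print(f"port {port} is {state}")
```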

Let me know which of these is doable.

It would be very kind of you to point out any other useful information that I may have missed. I am really looking forward to learning and contributing to this issue as I go.

Youpiiiii commented 1 year ago

Hello Sheikh,

No, the problem is more complicated. I have a feeling that dockerd-rootless.sh is trying to recreate containers on startup, but that Loki is preventing the containers from being stopped. I have no problem with classic Docker or Docker with userns; the problem occurs only with rootless Docker. I added "failed: port is already allocated" to help you think, but the bug is "failed to delete container from containerd" on startup.

But thanks for trying ;)

Sheikh-Abubaker commented 1 year ago

Thanks for the clarification. Can you tell me how to start working on this issue? I mean, where should I start?

Sheikh-Abubaker commented 1 year ago

Hello @Youpiiiii,

I just wanted to ask a simple question: do I need to clone this repo in a Proxmox VM (Debian) and then follow the information stated above to reproduce?

It would be great if you could tell me the prerequisites required to fix this issue. As I mentioned earlier, I've just started contributing to open source, and I am really looking forward to learning and contributing.

Youpiiiii commented 1 year ago

Hello @Sheikh-Abubaker,

While trying to prepare a script for you to reproduce the problem, I found the culprit. I use systemd services to start some monitoring containers such as node exporter, cadvisor, etc. This blocks other containers that were in exited status at startup. I thought it was the Loki driver because the problem appeared when I installed it. I don't know why it conflicts with systemd, but at least it'll be easier to investigate. Example:

cat .config/systemd/user/docker.node.service
[Unit]
Description=Node exporter Service
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
Restart=always
ExecStartPre=-docker stop node
ExecStartPre=-docker rm node
ExecStartPre=docker pull prom/node-exporter:latest
ExecStart=docker run --rm --name node -p 9100:9100 prom/node-exporter

[Install]
WantedBy=default.target
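
To see which containers were left in exited status after a reboot, something like the following could help. It is only a sketch under assumptions: the function names are hypothetical, and it assumes the `docker` CLI is on the PATH of the user running the rootless daemon. It calls `docker ps --all --filter status=exited` with a tab-separated format and parses the result.

```python
import subprocess


def parse_ps_lines(output):
    """Parse 'name<TAB>status' lines into a list of (name, status) tuples."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        name, _, status = line.partition("\t")
        rows.append((name, status))
    return rows


def exited_containers():
    """Ask the docker CLI for all containers currently in the exited state."""
    result = subprocess.run(
        ["docker", "ps", "--all", "--filter", "status=exited",
         "--format", "{{.Names}}\t{{.Status}}"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps_lines(result.stdout)


if __name__ == "__main__":
    try:
        for name, status in exited_containers():
            print(f"{name}: {status}")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"could not query docker: {exc}")
```

Running it once with the systemd units enabled and once with them disabled would show whether the units change which containers stay exited.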

I don't use this repo. I installed Loki with this command: docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions