google / cadvisor

Analyzes resource usage and performance characteristics of running containers.

cadvisor status complete in swarm cluster #2310

Open jchorier opened 5 years ago

jchorier commented 5 years ago

Hi, for the past few days all the cadvisor containers in my swarm cluster have been in the complete state, so no containers are running and no more metrics reach Prometheus.

Docker version 18.09.6 swarm
OS Ubuntu 18.04.2 LTS (GNU/Linux 4.18.0-1025-azure x86_64)

How it is configured:

cadvisor:
    image: google/cadvisor:v0.33.0
    command: -logtostderr -docker_only
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    networks:
      - agent_network
    deploy:
      mode: global
      restart_policy:
        condition: on-failure
      resources:
        limits:
          cpus: '0.1'
          memory: 128M
        reservations:
          cpus: '0.1'
          memory: 64M

No logs were available, so I modified the configuration to add the -v=4 parameter. Now I have some logs, but nothing I can make sense of. I restarted the container and the Docker daemon, redeployed the stack, and upgraded the Docker daemon to 19.03.1, but nothing works.

Complete log: cadvisor.log. There are a lot of messages like:
Factory "docker" was unable to handle container "/"
Factory "containerd" was unable to handle container "/"
Error trying to work out if we can handle /: / not handled by systemd handler

Any ideas on how to fix this issue, or a workaround?

thanks

dashpole commented 5 years ago

Those logs look normal. Only one of the container handlers is expected to be able to handle a given container. The container terminated due to an external signal: Exiting given signal: terminated
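A minimal sketch of one way to cut through the -v=4 output: it drops the handler messages quoted earlier in this thread and keeps everything else, so lines such as the signal message stand out. The function name `filter_expected_noise` and the pattern list are illustrative assumptions based only on the messages quoted here; they are not part of cAdvisor.

```python
import re

# Patterns for the handler-probing messages quoted in this thread.
# This list is an assumption drawn from those quotes, not an official list.
EXPECTED_NOISE = [
    re.compile(r'Factory "[^"]+" was unable to handle container'),
    re.compile(r"not handled by systemd handler"),
]


def filter_expected_noise(lines):
    """Return the log lines that match none of the expected-noise patterns."""
    return [
        line
        for line in lines
        if not any(pattern.search(line) for pattern in EXPECTED_NOISE)
    ]


# Sample built from the messages quoted in this thread.
sample = [
    'Factory "docker" was unable to handle container "/"',
    'Factory "containerd" was unable to handle container "/"',
    "Error trying to work out if we can handle /: / not handled by systemd handler",
    "Exiting given signal: terminated",
]
print(filter_expected_noise(sample))  # ['Exiting given signal: terminated']
```

To apply it to a saved log, pass the open file object (an iterable of lines) instead of `sample`.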

jchorier commented 5 years ago

Hi, that is exactly my problem: I can't find any message in the logs explaining why the container stops, or why it ends with status complete. I will explore the Docker logs... BUT I found a workaround: if I redeploy with the tag latest (instead of v0.33.0), everything works fine (containers are running and metrics are good). Any ideas why? Thanks
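One possible explanation for the complete status, offered as an assumption rather than a confirmed diagnosis: with `restart_policy` condition `on-failure`, swarm restarts a task only when its container exits with a non-zero code, and a task whose container exits with code 0 is reported as complete. If cadvisor exits with code 0 after receiving the termination signal (not verified in this thread), the tasks would stay complete and never be restarted. The sketch below models that rule; `should_restart` is a hypothetical helper, not a Docker API.

```python
def should_restart(condition: str, exit_code: int) -> bool:
    """Toy model of the swarm restart_policy conditions: none, on-failure, any."""
    if condition == "none":
        return False
    if condition == "on-failure":
        # Only a non-zero exit code counts as a failure.
        return exit_code != 0
    if condition == "any":
        return True
    raise ValueError(f"unknown restart condition: {condition!r}")


# A clean exit (code 0) under the policy from the configuration in this issue:
print(should_restart("on-failure", 0))  # False
# The same clean exit under the "any" condition:
print(should_restart("any", 0))  # True
```

Under this model, setting the condition to `any` would make swarm restart the task after a clean exit, although it would not explain what sent the signal in the first place.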