jittering / traefik-kop

A dynamic docker->redis->traefik discovery agent
MIT License

Container fails to restart automatically after server reboot #38

Open Kuppit opened 3 months ago

Kuppit commented 3 months ago

Hello,

I am experiencing an issue where the Traefik-Kop container does not restart automatically after a server reboot, despite having the restart: always policy set in the docker-compose.yml. After rebooting the server, the container exits with status code 2. The container logs do not provide further information on why the restart is not occurring.

Expected Behavior: The Traefik-Kop container should restart automatically after the server reboots.

Actual Behavior: The Traefik-Kop container does not restart and exits with status code 2.

Additional Information:

chetan commented 3 months ago

@Kuppit this only happens after a server reboot? Are you able to manually bring it up after intervening? Before trying those workarounds (tini, entrypoint), were you using the published docker image for kop or a custom one?

You can try enabling debug logs by setting DEBUG=1 in the environment to see if that sheds any light, but I can't think of anything that would cause this. There should be some sort of output, at minimum a panic.

Kuppit commented 3 months ago

Yes, this issue only occurs after a server reboot. I am able to manually bring it up after intervening. Before trying those workarounds (tini, entrypoint), I was using the published Docker image.

Here is my docker-compose.yml configuration:

services:
  kop:
    image: ghcr.io/jittering/traefik-kop:0.13.3
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    environment:
      TZ: Europe/Paris
      KOP_HOSTNAME: xxxxxxx-xxx
      REDIS_ADDR: "redis:6379"
      KOP_POLL_INTERVAL: 10
      DOCKER_CONFIG: |
        ---
        docker:
          exposedByDefault: false
          useBindPortIP: true
          network: proxy
    networks:
      - proxy
    logging:
      options:
        max-size: "256k"
        max-file: 1
    restart: always

networks:
  proxy:
    external: true

I enabled debug logs by setting DEBUG=1 in the environment and rebooted the server. Here is the end of the logs after the reboot:

kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=mapei-training-postgres-1-91feedf4f9807f04e47c8bc32908319e2cbdc20927345d6a4bcfd46048ad1e53
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=mapei-production-postgres-1-1bd06babc815e83c18135e359b3a7da37e647af03ada3a774dc9bb2d7a70fae2
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=mapei-redis-1-aca9a5695028c3e4824b38fab5cab7bd537cfc3a9b53445846e6fd688170dfa4
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=mapei-rabbitmq-1-dec23a32ef25ed8727d5eb5f600e6732c86dabe9be284da78bdd619fd4c46596
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=mapei-clamav-1-b47e887cc5cc4c8eb1c2b2c670319323525d1787eb0a7e87161d994861ac8a91
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" container=auditd-filebeat-628f4098a3cbdab7cc76eb28a3a952012b44e65956e3b8c44be40562aaa25461 providerName=docker
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=docker-filebeat-0f4103fbd6fb6e9b8b7a69c3ac3d865c67ca1cbc0a776798fb9ade1242e59070
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=traefik-filebeat-b82a31c8a74492431521354039db8a176bb1d6727cbc917fe7171c8c357e5305
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=syslog-filebeat-41442ee394174edcdbf49d0e71c4e8e719611808e22b2b69343e69fa03c17e05
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=node-exporter-prometheus-afc23acb479744602055c475ab1174d432fba4ffff410c7f00d230e0e535c869
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=process-exporter-prometheus-e609ee4e632d8701a056a9ccefa325ca1003c229f5f871cbb05e879d37b56285
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=docker-state-exporter-prometheus-e4ebd46cf0484e9daeac8ced95f3929514850bd1c76901f0a5a0495189be6213
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Filtering disabled container" providerName=docker container=crowdsec-crowdsec-92ea337fd67875dd9e3f1666286a031be6b0517c7b90886c3d0e358a7e27cc65
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Configuration received: {xxxxxxxxxxxx}" providerName=docker
kop-1  | time="2024-07-05T16:00:02Z" level=debug msg="Skipping unchanged configuration." providerName=docker
kop-1  | time="2024-07-05T16:00:07Z" level=debug msg=tick
kop-1  | time="2024-07-05T16:00:12Z" level=debug msg=tick
kop-1  | time="2024-07-05T16:00:17Z" level=debug msg=tick

Thank you for your assistance.

chetan commented 3 months ago

Strange. Can you check syslog and/or dmesg for any output? Sounds like it could be getting killed by the system.
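
Alongside syslog and dmesg, the state Docker recorded for the stopped container may help show whether it was killed from outside. The following is a minimal sketch, not part of traefik-kop: the helper names `summarize_state` and `inspect_container` are hypothetical, and it assumes the `docker` CLI is available on the host. It reads the JSON printed by `docker inspect` and pulls out the exit code, the OOM flag, the restart count, and the configured restart policy.

```python
import json
import subprocess


def summarize_state(inspect_json: str) -> dict:
    """Extract restart-relevant fields from `docker inspect` JSON output."""
    # `docker inspect` prints a JSON array with one object per inspected item.
    info = json.loads(inspect_json)[0]
    state = info.get("State", {})
    policy = info.get("HostConfig", {}).get("RestartPolicy", {})
    return {
        "status": state.get("Status"),
        "exit_code": state.get("ExitCode"),
        "oom_killed": state.get("OOMKilled"),
        "error": state.get("Error"),
        "finished_at": state.get("FinishedAt"),
        "restart_count": info.get("RestartCount"),
        "restart_policy": policy.get("Name"),
    }


def inspect_container(name: str) -> dict:
    """Run `docker inspect <name>` and summarize it (hypothetical helper)."""
    result = subprocess.run(
        ["docker", "inspect", name],
        check=True, capture_output=True, text=True,
    )
    return summarize_state(result.stdout)
```

Running `inspect_container` with the actual container name right after a failed boot would show what Docker itself recorded. If `oom_killed` is true or `error` is non-empty, that would point toward the host or the Docker daemon rather than kop; if `restart_policy` is not `always`, the compose setting may not have been applied to that container.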

chetan commented 3 months ago

Probably unrelated, but looks like there is also a bug in setting a custom poll interval.

chetan commented 3 months ago

Tested the poll interval and it looks fine on my end. Is it possible the log output you pasted was not from the config above? I see 5 seconds between ticks in your logs, but you have configured 10. Here's what I see at 10 seconds:

time="2024-07-06T08:08:42-04:00" level=debug msg=tick
time="2024-07-06T08:08:42-04:00" level=debug msg="Provider connection established with docker 26.1.4 (API 1.45)" providerName=docker
time="2024-07-06T08:08:42-04:00" level=debug msg="Filtering disabled container" providerName=docker container=redis-testing-b771a41e664d174c66b9958c29ed760cc2323ebd2f757b2a1efd363c32faf4a0
time="2024-07-06T08:08:42-04:00" level=debug msg="Configuration received: {\"http\":{},\"tcp\":{},\"udp\":{},\"tls\":{}}" providerName=docker
time="2024-07-06T08:08:42-04:00" level=debug msg="Skipping unchanged configuration." providerName=docker
time="2024-07-06T08:08:52-04:00" level=debug msg=tick
time="2024-07-06T08:08:52-04:00" level=debug msg="Provider connection established with docker 26.1.4 (API 1.45)" providerName=docker
time="2024-07-06T08:08:52-04:00" level=debug msg="Filtering disabled container" container=redis-testing-b771a41e664d174c66b9958c29ed760cc2323ebd2f757b2a1efd363c32faf4a0 providerName=docker
time="2024-07-06T08:08:52-04:00" level=debug msg="Configuration received: {\"http\":{},\"tcp\":{},\"udp\":{},\"tls\":{}}" providerName=docker
time="2024-07-06T08:08:52-04:00" level=debug msg="Skipping unchanged configuration." providerName=docker
time="2024-07-06T08:09:02-04:00" level=debug msg=tick
time="2024-07-06T08:09:02-04:00" level=debug msg="Provider connection established with docker 26.1.4 (API 1.45)" providerName=docker
time="2024-07-06T08:09:02-04:00" level=debug msg="Filtering disabled container" providerName=docker container=redis-testing-b771a41e664d174c66b9958c29ed760cc2323ebd2f757b2a1efd363c32faf4a0
time="2024-07-06T08:09:02-04:00" level=debug msg="Configuration received: {\"http\":{},\"tcp\":{},\"udp\":{},\"tls\":{}}" providerName=docker
time="2024-07-06T08:09:02-04:00" level=debug msg="Skipping unchanged configuration." providerName=docker

Kuppit commented 3 months ago

Regarding the poll interval, I can confirm that the log output I pasted was from the configuration above. Originally, I had set KOP_POLL_INTERVAL to 10 seconds, but I changed it to 5 seconds for testing purposes. Despite this, the issue persists.
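
The interval actually in effect can be read off the debug output by measuring the gap between consecutive `msg=tick` lines. The following is a minimal sketch, assuming the logrus-style `time="..."` field seen in the logs in this thread; the function name `tick_intervals` is hypothetical and not part of traefik-kop.

```python
import re
from datetime import datetime

# Matches the time="..." field on lines that carry msg=tick.
TICK_RE = re.compile(r'time="([^"]+)"\s+level=debug\s+msg=tick\b')


def tick_intervals(log_text: str) -> list[float]:
    """Return the gaps, in seconds, between consecutive msg=tick lines."""
    stamps = []
    for line in log_text.splitlines():
        match = TICK_RE.search(line)
        if match:
            # fromisoformat() only accepts a trailing "Z" on Python 3.11+,
            # so normalize it to an explicit UTC offset first.
            raw = match.group(1).replace("Z", "+00:00")
            stamps.append(datetime.fromisoformat(raw))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# The last three lines of the log posted earlier in this thread.
sample = """\
kop-1  | time="2024-07-05T16:00:07Z" level=debug msg=tick
kop-1  | time="2024-07-05T16:00:12Z" level=debug msg=tick
kop-1  | time="2024-07-05T16:00:17Z" level=debug msg=tick
"""
print(tick_intervals(sample))  # [5.0, 5.0]
```

Gaps of 5 seconds match a run with KOP_POLL_INTERVAL set to 5, which is consistent with the log having been captured after the value was changed from 10.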

I have now set KOP_POLL_INTERVAL to 60 seconds, and after three reboots, I am still experiencing the same problem.

Additionally, I have several other containers running on this server without any issues at all. They all restart correctly after a server reboot.

Any assistance or guidance on resolving this issue would be greatly appreciated. Thank you for your time and help!