flannel-io / flannel

flannel is a network fabric for containers, designed for Kubernetes
Apache License 2.0

flanneld-v0.7.1 crashes after journald restart #2062

Open bitpeng opened 2 months ago

bitpeng commented 2 months ago

We use systemd to manage flanneld-v0.7.1 on our k8s nodes. One day the flanneld process exited, but systemd did not restart it, even though we have set Restart=on-failure in the flanneld.service unit.

Expected Behavior

Current Behavior

$ systemctl status flanneld
 flanneld.service - Flanneld overlay address etcd agent
   Loaded: loaded (/usr/lib/systemd/system/flanneld.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Tue 2024-08-30 06:06:32 CST; 4 months 21 days ago
 Main PID: 204496 (code=killed, signal=PIPE)

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

$ cat /usr/lib/systemd/system/flanneld.service
[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/flanneld
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=/usr/bin/flanneld-start $FLANNEL_OPTIONS
ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

After an in-depth check, we found that journald had restarted, which caused flanneld to exit.

$ systemctl status systemd-journald

 systemd-journald.service - Journal Service
   Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service; static; vendor preset: disabled)
   Active: active (running) since Mon 2024-08-29 16:42:39 CST; 4 months 22 days ago
     Docs: man:systemd-journald.service(8)
           man:journald.conf(5)
 Main PID: 159518 (systemd-journal)
   Status: "Processing requests..."
    Tasks: 1
   Memory: 4.0G
   CGroup: /system.slice/systemd-journald.service
           └─159518 /usr/lib/systemd/systemd-journald
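
A note on why Restart=on-failure may not have fired here: the systemd.service documentation treats termination by SIGHUP, SIGINT, SIGTERM or SIGPIPE as a clean exit, which is consistent with the unit showing "inactive (dead)" rather than "failed" after code=killed, signal=PIPE. A minimal drop-in sketch that would restart the service in that case is shown below. The drop-in file name and the RestartSec value are assumptions for illustration, and this has not been verified on the affected node.

```ini
# /etc/systemd/system/flanneld.service.d/10-restart.conf
# (hypothetical drop-in path; any *.conf name in this directory works)
[Service]
# on-failure skips "clean" signals such as SIGPIPE; always does not.
Restart=always
# Assumed back-off between restart attempts; tune as needed.
RestartSec=5
```

After adding the drop-in, systemctl daemon-reload followed by systemctl restart flanneld would pick it up. This only works around the missing restart; it does not stop flanneld from being killed when journald restarts.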

Possible Solution

flanneld-v0.7.1 was built with an older Go version that predates Go modules. I ran git checkout v0.7.1 but found no Go version recorded in glide.yaml or glide.lock. I also found that some software built with Go versions older than go1.6 crashes after a journald restart, for example: https://github.com/moby/moby/issues/19728, https://github.com/moby/moby/pull/22460. Is this the same bug?

Steps to Reproduce (for bugs)

1. 2. 3. 4.

Context

Your Environment

rbrtbnfgl commented 1 month ago

Hi, why are you using an older version of Flannel?