fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0

Update/Reload without downtime #4624

Open daipom opened 2 months ago

daipom commented 2 months ago

Which issue(s) this PR fixes:

What this PR does / why we need it: See #4622.

Specification:

  1. The supervisor receives SIGUSR2.
  2. It spawns a new supervisor.
  3. The new supervisor takes over the shared sockets.
  4. New workers are launched and the old processes are stopped in parallel.
    • The new workers are launched in source-only mode.
      • This is limited to input plugins that report zero_downtime_restart_ready?.
    • SIGTERM is sent to the old supervisor 10 seconds after step 3.
  5. The old supervisor stops and sends SIGWINCH to the new one.
  6. The new workers switch to running fully.
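
The sequence above can be sketched as a small, signal-free Ruby simulation. This is only an illustration of the ordering and of the plugin filter, not the PR's implementation: `InputPlugin`, `HandoffSimulation`, and the plugin names are hypothetical, and treating `zero_downtime_restart_ready?` as a predicate that returns true for eligible plugins is an assumption based on its name.

```ruby
# Illustrative model only: none of these classes exist in Fluentd.
InputPlugin = Struct.new(:name, :ready) do
  # Named after the predicate in the specification. The semantics
  # (true means the plugin may run in source-only mode) are assumed.
  def zero_downtime_restart_ready?
    ready
  end
end

class HandoffSimulation
  TERM_DELAY_SECONDS = 10 # delay stated in step 4 of the specification

  attr_reader :events

  def initialize(plugins)
    @plugins = plugins
    @events = []
  end

  # Input plugins that would be started while the old processes still run.
  def source_only_plugins
    @plugins.select(&:zero_downtime_restart_ready?)
  end

  # Records the six specification steps in order. No real signals are sent.
  def run
    names = source_only_plugins.map(&:name).join(", ")
    @events << "1. supervisor receives SIGUSR2"
    @events << "2. new supervisor is spawned"
    @events << "3. shared sockets are taken over"
    @events << "4a. new workers start in source-only mode: #{names}"
    @events << "4b. SIGTERM goes to the old supervisor after #{TERM_DELAY_SECONDS}s"
    @events << "5. old supervisor stops and sends SIGWINCH to the new one"
    @events << "6. new workers run fully"
    self
  end
end

# Hypothetical plugin names, used only to exercise the filter.
plugins = [
  InputPlugin.new("example_ready", true),
  InputPlugin.new("example_not_ready", false)
]
puts HandoffSimulation.new(plugins).run.events
```

The model separates steps 4a and 4b to show that starting the new workers and stopping the old processes are independent actions that overlap in time.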


Supported input plugins:

Needs the following:

Docs Changes: TODO

Release Note: TODO

TODO:

daipom commented 1 month ago

The basic implementation is done. Some concepts from #4654 are reflected. Thanks @Watson1978!