NixOS / hydra

Hydra, the Nix-based continuous build system
http://nixos.org/hydra
GNU General Public License v3.0
1.1k stars 291 forks

hydra-queue-runner doesn't reconnect after a postgresql server restart, staying stuck forever #1336

Closed: delroth closed 3 months ago

delroth commented 5 months ago

Describe the bug

After we restarted the PostgreSQL database for hydra.nixos.org, the queue runner got stuck logging that it had lost its connection and never reconnected on its own. I had to restart it manually.

Expected behavior

The queue runner properly and gracefully recovers from a temporary PostgreSQL connection failure.

Additional context

Logged forever every 10s:

Jan 12 22:14:13 rhea hydra-queue-runner[4016980]: main thread: Lost connection to the database server.
Jan 12 22:14:13 rhea hydra-queue-runner[4016980]: queue monitor: Lost connection to the database server.
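
To illustrate the recovery behavior asked for above, here is a minimal sketch of a reconnect loop with capped exponential backoff. It is not Hydra's code (the queue runner is written in C++), and every name in it (`ConnectionLost`, `run_with_reconnect`, and the `connect`, `work` and `sleep` parameters) is hypothetical. The point it shows is that each retry opens a fresh connection instead of reusing the dead one.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ConnectionLost(Exception):
    """Hypothetical stand-in for a database driver's lost-connection error."""


def run_with_reconnect(
    connect: Callable[[], T],
    work: Callable[[T], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Run work(conn); on ConnectionLost, reconnect and try again.

    Gives up and re-raises after max_attempts consecutive failures.
    """
    failures = 0
    while True:
        try:
            # Always open a new connection: a handle that has lost its
            # server is never reused.
            conn = connect()
            work(conn)
            return
        except ConnectionLost:
            failures += 1
            if failures >= max_attempts:
                raise
            # Exponential backoff (1s, 2s, 4s, ...) capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** (failures - 1)))
```

With a real client, the driver's own connection-error exception would take the place of `ConnectionLost`, and a long-running daemon would probably retry indefinitely rather than give up after a fixed number of attempts.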