Open antifuchs opened 3 years ago
To make this more clear, I don't think this is an issue with startup of the two units as much as shutdown: If both units get stopped during a reboot, my suspicion is that zookeeper stops before kafka can deregister itself, leading to old registrations sitting around, which then prevent a successful startup until zookeeper has had a chance to clean out old sessions (10min into bootup or whatever the session timeout is).
Fwiw, kafka is moving away from zookeeper, maintaining its own metadata using its own means instead.
Re. KRaft, I have put up a PR for a -- hopefully -- unopinionated way of achieving this. I don't think I like making assumptions on where the controller lives whether or not it's Zookeeper or KRaft -- colocation is incidental from the POV of NixOS, I think. Not that it's a huge deal to place a systemd ordering, but it's also not a huge deal on the user end.
Regardless, PTAL at #203987 and the PR #224611 -- maybe I am moving in the wrong direction by making the module less smart, but I personally think it makes us (as in nixpkgs/NixOS) more flexible.
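For anyone who wants the ordering on the user end in the meantime, a minimal sketch of what that could look like in configuration.nix follows. It assumes both services run on the same host and that the units are named apache-kafka.service and zookeeper.service; it is not taken from either of the linked PRs.

```nix
{ ... }:
{
  # User-side override: order Kafka after the local ZooKeeper unit.
  # Assumption: both services are enabled on this machine.
  systemd.services.apache-kafka = {
    # Start ZooKeeper whenever Kafka is started, and stop Kafka if ZooKeeper stops.
    requires = [ "zookeeper.service" ];
    # Start Kafka only after ZooKeeper; systemd applies the reverse order on
    # shutdown, so Kafka is stopped before ZooKeeper.
    after = [ "zookeeper.service" ];
  };
}
```

The reversed stop order is the part that matters for the shutdown theory above: Kafka gets a chance to deregister while ZooKeeper is still up.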
Describe the bug
I'm running apache-kafka connected to a local zookeeper (pretty much the default configuration if you enable both services), and on boot, the apache-kafka service often fails to start, with an error like:
This seems to indicate that there are broker registrations in the zookeeper directory that weren't properly cleared, as can happen when kafka doesn't shut down cleanly.
To Reproduce
Steps to reproduce the behavior:
systemctl list-units --failed
About 75% of times I reboot the machine, apache-kafka is listed in the failed units, with the log indicating that it found an old broker ID.
Expected behavior
apache-kafka should start up properly every time on boot.
Additional context
I'm pretty sure this is rooted in a missing dependency between the kafka service and the zookeeper service; as the nixos config already has a setting for kafka's zookeeper servers, we could put a requires & after clause in kafka's unit if it should talk to localhost & zookeeper is also enabled.
Notify maintainers
@ragnard @srhb
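A rough sketch of the module-side requires & after idea from the additional context follows. The option name services.apache-kafka.zookeeper for the connection string and the localhost prefix check are assumptions made for illustration; the real module may expose this setting differently.

```nix
{ config, lib, ... }:
let
  kafkaCfg = config.services.apache-kafka;

  # Assumption: the ZooKeeper connection string is exposed as
  # services.apache-kafka.zookeeper (e.g. "localhost:2181").
  usesLocalZookeeper =
    config.services.zookeeper.enable
    && lib.hasPrefix "localhost:" kafkaCfg.zookeeper;
in
{
  # Only add the dependency when Kafka talks to a ZooKeeper on this host.
  config = lib.mkIf (kafkaCfg.enable && usesLocalZookeeper) {
    systemd.services.apache-kafka = {
      requires = [ "zookeeper.service" ];
      after = [ "zookeeper.service" ];
    };
  };
}
```

A prefix check like this would miss 127.0.0.1 or the machine's own hostname, so it is only a starting point for the condition.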
Metadata
system: "x86_64-linux"
host os: Linux 5.10.40, NixOS, 21.11.20210608.51bb9f3 (Porcupine)
multi-user?: yes
sandbox: yes
version: nix-env (Nix) 2.4pre20210601_5985b8b
nixpkgs: /nix/store/gc0lfq01vfmilfsld5mb0znjim801xxx-source
Maintainer information: