NixOS / nixpkgs

Kubernetes etcd does not start correctly #124037

Open uhthomas opened 3 years ago

uhthomas commented 3 years ago

Describe the bug

https://github.com/NixOS/nixpkgs/issues/59364 is relevant but has no conclusion.

Kubernetes now builds OK using https://github.com/NixOS/nixpkgs/pull/123741, but the install does not succeed.

I've tried this on multiple machines and they all report the same thing.

I've tried reinstalling multiple times, and removing relevant dirs but the issue persists.

rm -rf /var/lib/cfssl /var/lib/kubernetes

To Reproduce

Add the following lines to a NixOS host's configuration.nix:

{
  services.kubernetes = {
    roles = [ "master" "node" ];
    masterAddress = "api.kube";
    apiserverAddress = "https://api.kube:6443";
    easyCerts = true;
    apiserver = {
      securePort = 6443;
      advertiseAddress = "10.0.0.5";
    };
  };
}

Expected behavior

A successful Kubernetes install.

Actual behavior

[ERROR] Deployment to c28593b8bf failed. Last 10 lines of logs:
[ERROR] May 22 16:06:18 c28593b8bf etcd[2188329]: Git SHA: GitNotFound
[ERROR] May 22 16:06:18 c28593b8bf etcd[2188329]: Go Version: go1.16.4
[ERROR] May 22 16:06:18 c28593b8bf etcd[2188329]: Go OS/Arch: linux/amd64
[ERROR] May 22 16:06:18 c28593b8bf etcd[2188329]: setting maximum number of CPUs to 32, total number of available CPUs is 32
[ERROR] May 22 16:06:18 c28593b8bf etcd[2188329]: peerTLS: cert = /var/lib/kubernetes/secrets/etcd.pem, key = /var/lib/kubernetes/secrets/etcd-key.pem, ca = , trusted-ca = /var/lib/kubernetes/secrets/ca.pem, client-cert-auth = false, crl-file =
[ERROR] May 22 16:06:18 c28593b8bf etcd[2188329]: open /var/lib/kubernetes/secrets/etcd.pem: no such file or directory
[ERROR] May 22 16:06:18 c28593b8bf systemd[1]: etcd.service: Main process exited, code=exited, status=1/FAILURE
[ERROR] May 22 16:06:18 c28593b8bf systemd[1]: etcd.service: Failed with result 'exit-code'.
[ERROR] May 22 16:06:18 c28593b8bf systemd[1]: Failed to start etcd key-value store.

Notify maintainers

@johanot @offlinehacker @saschagrunert

Metadata

# nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 5.10.37, NixOS, 21.05pre-git (Okapi)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.3.11`
 - channels(root): `"nixos-20.09.4132.52090c613ad"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

Maintainer information:

# a list of nixpkgs attributes affected by the problem
attribute:
# a list of nixos modules affected by the problem
module:

cmacrae commented 3 years ago

I've been using this hack to work around this:

systemd.services.etcd.preStart = ''${pkgs.writeShellScript "etcd-wait" ''
  while [ ! -f /var/lib/kubernetes/secrets/etcd.pem ]; do sleep 1; done
''}'';

uhthomas commented 3 years ago

@cmacrae

[ERROR] May 24 13:23:52 5dc508ed7c systemd[1]: Starting etcd key-value store...
[ERROR] May 24 13:25:22 5dc508ed7c systemd[1]: etcd.service: start-pre operation timed out. Terminating.
[ERROR] May 24 13:25:22 5dc508ed7c systemd[1]: etcd.service: Control process exited, code=killed, status=15/TERM
[ERROR] May 24 13:25:22 5dc508ed7c systemd[1]: etcd.service: Failed with result 'timeout'.
[ERROR] May 24 13:25:22 5dc508ed7c systemd[1]: Failed to start etcd key-value store.

No luck.
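
For reference, the timestamps in the log above show the start-pre step being terminated exactly 90 seconds after the unit began starting (13:23:52 to 13:25:22), which matches systemd's default start timeout. So the wait loop in preStart was killed by that timeout rather than failing on its own. A minimal sketch of raising the limit follows, assuming the certificate does eventually get written and only arrives late. The "10min" value is an arbitrary illustration, not something taken from this thread.

```nix
{
  # Assumption: etcd.pem is eventually created, just later than systemd's
  # default 90 second start timeout allows. "10min" is an arbitrary value.
  # TimeoutStartSec also covers ExecStartPre, which is where the wait loop runs.
  systemd.services.etcd.serviceConfig.TimeoutStartSec = "10min";
}
```

If /var/lib/kubernetes/secrets/etcd.pem is never created at all, this only delays the same failure, and the unit that is supposed to provision the certificate would need to be investigated instead.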