kubernetes-sigs / kubespray

Deploy a Production Ready Kubernetes Cluster
Apache License 2.0

ansible_lockfile in lookup when determining kubeadm_certificate_key #9916

Closed hgomez closed 1 week ago

hgomez commented 1 year ago

Environment:

  config file = /home/henri/Documents/kubespray/ansible.cfg
  configured module search path = ['/home/henri/Documents/kubespray/library']
  ansible python module location = /home/henri/Documents/kubespray/kubespray-venv/lib/python3.11/site-packages/ansible
  ansible collection location = /home/henri/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/henri/Documents/kubespray/kubespray-venv/bin/ansible-playbook
  python version = 3.11.2 (main, Feb  8 2023, 00:00:00) [GCC 12.2.1 20221121 (Red Hat 12.2.1-4)]
  jinja version = 2.11.3
  libyaml = True

Kubespray version (commit) (git rev-parse --short HEAD): 2.21.0

Network plugin used: default

Output of ansible run:

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible.errors.AnsibleError: An unhandled exception occurred while templating '{{ lookup('password', credentials_dir + '/kubeadm_certificate_key.creds length=64 chars=hexdigits') | lower }}'. Error was a <class 'ansible.errors.AnsibleError'>, original message: An unhandled exception occurred while running the lookup plugin 'password'. Error was a <class 'FileExistsError'>, original message: [Errno 17] Le fichier existe: b'/home/henri/Documents/BatCave/TerraformExperiments/kubespray/inventory/cluster1-rocky/credentials/b4b16befec53bb8b3281feab1bd9e824fb2fff14.ansible_lockfile'. [Errno 17] Le fichier existe: b'/home/henri/Documents/BatCave/TerraformExperiments/kubespray/inventory/cluster1-rocky/credentials/b4b16befec53bb8b3281feab1bd9e824fb2fff14.ansible_lockfile'
fatal: [k8sm3.hgomez.net]: FAILED! => {"changed": false, "msg": "AnsibleError: An unhandled exception occurred while templating '{{ lookup('password', credentials_dir + '/kubeadm_certificate_key.creds length=64 chars=hexdigits') | lower }}'. Error was a <class 'ansible.errors.AnsibleError'>, original message: An unhandled exception occurred while running the lookup plugin 'password'. Error was a <class 'FileExistsError'>, original message: [Errno 17] Le fichier existe: b'/home/henri/Documents/BatCave/TerraformExperiments/kubespray/inventory/cluster1-rocky/credentials/b4b16befec53bb8b3281feab1bd9e824fb2fff14.ansible_lockfile'. [Errno 17] Le fichier existe: b'/home/henri/Documents/BatCave/TerraformExperiments/kubespray/inventory/cluster1-rocky/credentials/b4b16befec53bb8b3281feab1bd9e824fb2fff14.ansible_lockfile'"}
ok: [k8sm1.hgomez.net]
ok: [k8sm2.hgomez.net]

Anything else we need to know:

inventory/group_vars/k8s-cluster.yml contains

kubeadm_certificate_key: "{{ lookup('password', credentials_dir + '/kubeadm_certificate_key.creds length=64 chars=hexdigits') | lower }}"

i.e. the default from the sample inventory: https://github.com/kubernetes-sigs/kubespray/blob/master/inventory/sample/group_vars/k8s_cluster/k8s-cluster.yml#L234-L235

There seems to be a lock conflict when kubeadm_certificate_key is resolved for several masters simultaneously (3 in my case) while templating in https://github.com/kubernetes-sigs/kubespray/blob/master/roles/kubernetes/control-plane/tasks/kubeadm-setup.yml#L81-L85

- name: kubeadm | Create kubeadm config
  template:
    src: "kubeadm-config.{{ kubeadmConfig_api_version }}.yaml.j2"
    dest: "{{ kube_config_dir }}/kubeadm-config.yaml"
    mode: 0640
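
One way to test the concurrency hypothesis is to serialize this task. The sketch below repeats the task above and adds Ansible's `throttle` keyword so that only one host renders the template at a time. This is an untested assumption, not the upstream fix, and it only covers this one task, not other places where `kubeadm_certificate_key` is evaluated.

```yaml
# Illustrative variant of the task above, not taken from kubespray.
- name: kubeadm | Create kubeadm config
  template:
    src: "kubeadm-config.{{ kubeadmConfig_api_version }}.yaml.j2"
    dest: "{{ kube_config_dir }}/kubeadm-config.yaml"
    mode: 0640
  # Assumption: limiting the task to one worker at a time avoids
  # concurrent evaluations of the password lookup.
  throttle: 1
```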

Template file https://github.com/kubernetes-sigs/kubespray/blob/master/roles/kubernetes/control-plane/templates/kubeadm-controlplane.v1beta3.yaml.j2#L18-L19

controlPlane:
  localAPIEndpoint:
    advertiseAddress: {{ kube_apiserver_address }}
    bindPort: {{ kube_apiserver_port }}
  certificateKey: {{ kubeadm_certificate_key }}
nodeRegistration:

If I hard-code kubeadm_certificate_key in inventory/group_vars/k8s-cluster.yml, the installation completes without errors:

kubeadm_certificate_key: "dbfEcDFfF8Cc6fcaCDfBC0c4eb6baea4FDbbee4B3fc1A252c5bfe765de6FbEDc"
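
Rather than hard-coding the value, the lookup could be evaluated a single time and shared with all control plane hosts. A minimal sketch, assuming a hypothetical extra play that runs before the control-plane role: `run_once` makes Ansible execute the task for one host and apply the resulting fact to every host in the batch. This is not an upstream fix and has not been verified against this failure.

```yaml
# Hypothetical play, placed before the control-plane role runs.
- name: Resolve kubeadm certificate key once
  hosts: kube_control_plane
  gather_facts: false
  tasks:
    - name: Evaluate the password lookup in a single worker
      set_fact:
        kubeadm_certificate_key: "{{ lookup('password', credentials_dir + '/kubeadm_certificate_key.creds length=64 chars=hexdigits') | lower }}"
      # run_once executes the task for one host only; the fact is
      # then applied to all hosts in the current batch.
      run_once: true
```

Because facts set with `set_fact` take precedence over inventory group variables, later tasks would see the stored string instead of re-running the lookup.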

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 1 year ago


/lifecycle rotten

nicolas-goudry commented 7 months ago

/remove-lifecycle rotten

nicolas-goudry commented 7 months ago

No promises here. I’ll take a look at it when I can.

nicolas-goudry commented 7 months ago

/assign

hgomez commented 7 months ago

Thanks a lot Nicolas

k8s-triage-robot commented 4 months ago


/lifecycle stale

nicolas-goudry commented 4 months ago

/remove-lifecycle stale

lenow55 commented 4 months ago

I have the same problem: every cluster deployment fails.

fatal: [kubespray-control-0]: FAILED! => {"msg": "The conditional check 'kubeadm_certificate_key is not defined' failed. The error was: An unhandled exception occurred while templating '{{ lookup('password', credentials_dir + '/kubeadm_certificate_key.creds length=64 chars=hexdigits') | lower }}'. Error was a <class 'ansible.errors.AnsibleError'>, original message: An unhandled exception occurred while running the lookup plugin 'password'. Error was a <class 'FileExistsError'>, original message: [Errno 17] Файл существует: b'/home/lenow/actual_project/pgpool_deploy_yandex/kubespray/inventory/sample/credentials/4ed7348d1ba2c34c44925ec3609e16f62b8e8526.ansible_lockfile'. [Errno 17] Файл существует: b'/home/lenow/actual_project/pgpool_deploy_yandex/kubespray/inventory/sample/credentials/4ed7348d1ba2c34c44925ec3609e16f62b8e8526.ansible_lockfile'\n\nThe error appears to be in '/home/lenow/actual_project/pgpool_deploy_yandex/kubespray/roles/kubernetes/control-plane/tasks/kubeadm-setup.yml': line 210, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Set kubeadm certificate key\n  ^ here\n"}

nicolas-goudry commented 4 months ago


@lenow55 Could you please provide the steps to reproduce the issue? (details about your variables, the command you use to run the cluster.yml playbook, etc…)

I haven't been able to reproduce this issue yet, so that would help me dig deeper. Thanks.

lenow55 commented 4 months ago

@nicolas-goudry I'm sorry for the long answer.

Environment:

[kube_control_plane]
kubespray-control-bastion
kubespray-control-0
kubespray-control-1

[etcd]
kubespray-control-bastion
kubespray-control-0
kubespray-control-1

[kube_node]
kubespray-control-bastion
kubespray-control-0
kubespray-control-1
kubespray-postgres-0
kubespray-postgres-1
kubespray-pgpool-0
kubespray-pgbench-0

[k8s_cluster:children]
kube_control_plane
kube_node

[bastion]
kubespray-control-bastion ansible_host=158.***.114.205

[postgres_cluster]
kubespray-postgres-0
kubespray-postgres-1

[pgpool]
kubespray-pgpool-0

[pgbench]
kubespray-pgbench-0

# settings
[all:vars]
ansible_user=ubuntu
ansible_ssh_private_key_file=~/.ssh/yandex_test_cluster
ansible_ssh_common_args="-o ProxyCommand='ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -W %h:%p -q {{ ansible_user }}@158.***.114.205 {% if ansible_ssh_private_key_file is defined %}-i {{ ansible_ssh_private_key_file }}{% endif %}'"

[k8s_cluster:vars]
# These two settings will put kubectl and admin.config in $inventory/artifacts
kubeconfig_localhost=True
kubectl_localhost=True
docker_rpm_keepcache=1
download_run_once=True


**To avoid this issue I use the patch from [this PR](https://github.com/kubernetes-sigs/kubespray/pull/10523/files)**, and it works fine.

k8s-triage-robot commented 1 month ago


/lifecycle stale

k8s-triage-robot commented 1 week ago


/lifecycle rotten

nicolas-goudry commented 1 week ago

/close

Duplicate of #10321 (there are far more details there, along with a fix proposal)

k8s-ci-robot commented 1 week ago

@nicolas-goudry: Closing this issue.

In response to [this](https://github.com/kubernetes-sigs/kubespray/issues/9916#issuecomment-2252738158):

>/close
>
>Duplicate of #10321 (there is far more details there, with a fix proposal)

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.