stackhpc / ansible-collection-cephadm

cluster.yml - 'dict object' has no attribute 'nodename' #155

Open PC-Admin opened 1 month ago

PC-Admin commented 1 month ago

I regularly hit this error when first bootstrapping a cluster with this collection, at the point where the main cluster.yml spec is templated:

TASK [stackhpc.cephadm.cephadm : Template out cluster.yml] *************************************************************************************************************
fatal: [storage-13-09002]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'nodename'"}

This is the failing task: https://github.com/stackhpc/ansible-collection-cephadm/blob/fc73a5d7a5adbef20d9f044a92ab5dff674d8607/roles/cephadm/tasks/bootstrap.yml#L76

The template it renders seems to refer to this (missing?) Ansible fact: https://github.com/stackhpc/ansible-collection-cephadm/blob/fc73a5d7a5adbef20d9f044a92ab5dff674d8607/roles/cephadm/templates/cluster.yml.j2#L4

Strangely, this fact is definitely defined before the bootstrap role is called:

TASK [cephadm-setup : DEBUG - Display nodename] ***************************************************************************************************************************************
...
ok: [storage-13-09002] => {
    "ansible_facts.nodename": "storage-13-09002"
}
ok: [storage-13-09004] => {
    "ansible_facts.nodename": "storage-13-09004"
}
ok: [storage-13-09006] => {
    "ansible_facts.nodename": "storage-13-09006"
}

This bug seems to occur consistently on the first run, or whenever -e cephadm_bootstrap=true is set. Otherwise it doesn't occur.

Happy to provide more context if needed.

PC-Admin commented 1 month ago

This might be a non-issue: it may actually be caused by a few expected hosts being down during bootstrap. I'll keep an eye on it...
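
If the down hosts are the cause, one way to confirm it is to check the facts of every host in the group rather than each host's own facts, since the debug output above only proves that the reachable hosts have `nodename`. A minimal sketch, assuming the template loops over an inventory group (called `ceph` here, which is an assumption, so adjust it to your inventory) and reads `hostvars[host].ansible_facts.nodename`. The task name and message are made up for illustration:

```yaml
# Hypothetical pre-check, run before the cephadm role.
# Assumption: 'ceph' is the inventory group the template iterates over.
- name: Check that every host in the group has a gathered nodename fact
  ansible.builtin.assert:
    that:
      # A host whose facts were never gathered has no 'nodename' key.
      - "'nodename' in (hostvars[item].ansible_facts | default({}))"
    fail_msg: "No nodename fact for {{ item }} (unreachable, or facts not gathered?)"
  loop: "{{ groups['ceph'] }}"
  run_once: true
```

If this fails only for the hosts that were down, that would support the theory that the template is tripping over hosts with no gathered facts rather than over the bootstrap host itself.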