cloudera / cloudera-playbook

Cloudera deployment automation with Ansible
Apache License 2.0

AnsibleUndefinedVariable: 'dict object' has no attribute #22

Closed kuldeepkulkarni09 closed 5 years ago

kuldeepkulkarni09 commented 5 years ago

I am running the playbook from the CM host and getting the error below. I have tried a lot of things but cannot figure out what is going wrong: is it a problem with the variables in group_vars, or am I missing something?

TASK [scm : file] **********************************************************************************************************************************************************************************************************************************************************************************************************************************************************
changed: [kkulkani-cdhkerberos-1]

TASK [scm : Import KDC admin credentials] **********************************************************************************************************************************************************************************************************************************************************************************************************************************
ok: [kkulkani-cdhkerberos-1]

TASK [scm : Wait for agent heartbeats] *************************************************************************************************************************************************************************************************************************************************************************************************************************************
Pausing for 30 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
Press 'C' to continue the play or 'A' to abort
ok: [kkulkani-cdhkerberos-1]

TASK [scm : Prepare CMS template] ******************************************************************************************************************************************************************************************************************************************************************************************************************************************
fatal: [kkulkani-cdhkerberos-1]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute u'kkulkani-cdhkerberos-1'"}
kuldeepkulkarni09 commented 5 years ago

@roczei @jamestyj

rouxero commented 5 years ago

Same problem here!

ok: [xxx.xxx.xxx.xxx]

TASK [scm : Wait for agent heartbeats] ******************************************************************************************************************************
Pausing for 30 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [xxx.xxx.xxx.xxx]

TASK [scm : Prepare CMS template] ***********************************************************************************************************************************
fatal: [xxx.xxx.xxx.xxx -> localhost]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute u'xxx.xxx.xxx.xxx'"}
        to retry, use: --limit @/opt/ansible/cloudera-playbook/site.retry

PLAY RECAP **********************************************************************************************************************************************************
xxx.xxx.xxx.xxx              : ok=21   changed=2    unreachable=0    failed=1
roczei commented 5 years ago

Hi @rouxero, @kuldeepkulkarni09,

I have just tested the latest git repo. It works for me:

TASK [scm : Wait for agent heartbeats] *********************************************************************************
Pausing for 30 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [playbook-test-1.gce.cloudera.com]

TASK [scm : Prepare CMS template] **************************************************************************************
changed: [playbook-test-1.gce.cloudera.com -> localhost]

TASK [scm : Setup the Cloudera Management Services (CMS)] **************************************************************
ok: [playbook-test-1.gce.cloudera.com -> localhost]

TASK [scm : debug] *****************************************************************************************************
ok: [playbook-test-1.gce.cloudera.com] => {

Could you please share with me your ansible_hosts configuration? Maybe something is different in your case.

alcher commented 5 years ago

Hi @roczei @jrkinley @Jimvin

I encountered the same issue using the latest git repo. I set the krb5 type to none, ad, and mit, with the same output each time.

Debug output:

TASK [scm : Prepare CMS template] ****
task path: /home/admin/cloudera-playbook/roles/scm/tasks/cms.yml:9
ESTABLISH LOCAL CONNECTION FOR USER: admin
EXEC /bin/sh -c 'echo ~admin && sleep 0'
EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/admin/.ansible/tmp/ansible-tmp-1557208095.46-187677447756307 `" && echo ansible-tmp-1557208095.46-187677447756307="` echo /home/admin/.ansible/tmp/ansible-tmp-1557208095.46-187677447756307 `" ) && sleep 0'
EXEC /bin/sh -c 'rm -f -r /home/admin/.ansible/tmp/ansible-tmp-1557208095.46-187677447756307/ > /dev/null 2>&1 && sleep 0'
fatal: [server-admin-01 -> localhost]: FAILED! => {
    "changed": false,
    "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute u'server-admin-01'"
}
        to retry, use: --limit @/home/admin/cloudera-playbook/site.retry

**Hostfile**

# Note for AWS: 'Public DNS' name is too long for ansible_host, use 'Public IP' (https://github.com/ansible/ansible/issues/11536)
[scm_server]
server-admin-01 license_file=/home/admin/cloudera_license.txt

[db_server]
server-admin-01

[utility_servers:children]
scm_server
db_server

[krb5_server]
server-admin-01 default_realm=CLOUDERA.COM

[master_servers]
server-admin-02 host_template=HostTemplate-Master1
server-data-01 host_template=HostTemplate-Master2
server-data-01 host_template=HostTemplate-Master3

[worker_servers]
server-data-01
server-data-02
server-admin-02

[worker_servers:vars]
host_template=HostTemplate-Workers

[cdh_servers:children]
utility_servers
master_servers
krb5_server
worker_servers

[all:vars]
ansible_user=admin
anis016 commented 5 years ago

Hi @roczei! Is there any update on this issue? I am getting the same error. It seems to appear after adding this:

[worker_servers:vars]
host_template=HostTemplate-Workers

roczei commented 5 years ago

Hi All,

I was able to reproduce this issue with alcher's ansible_hosts configuration. The root cause is that the hostnames are not fully qualified domain names (FQDNs). Here is an example of an FQDN: playbook-1.gce.cloudera.com

I have modified them to FQDNs (example: playbook-1 ---> playbook-1.gce.cloudera.com).

The next error occurred because the HDFS-HTTPFS-1 role was not assigned to any host:

> TASK [cdh : Import cluster template] *********************************************************************************************
> fatal: [playbook-1.gce.cloudera.com]: FAILED! => {"changed": false, "connection": "close", "content": "{\n  \"message\" : \"The following roles are not resolved by any host: HDFS-HTTPFS-1.\"\n}", "content_type": "application/json", "date": "Tue, 28 May 2019 21:06:46 GMT", "elapsed": 2, "expires": "Thu, 01-Jan-1970 00:00:00 GMT", "json": {"message": "The following roles are not resolved by any host: HDFS-HTTPFS-1."}, "msg": "Status code was 400 and not [200]: HTTP Error 400: Bad Request", "redirected": false, "server": "Jetty(6.1.26.cloudera.4)", "set_cookie": "CLOUDERA_MANAGER_SESSIONID=x4l8jw2qr0edxb8nb3k28d3g;Path=/;HttpOnly", "status": 400, "url": "http://playbook-1.gce.cloudera.com:7180/api/v19/cm/importClusterTemplate?addRepositories=true"}

Here is a working example; please test it (you just need to change the hostnames and use FQDNs everywhere):

[scm_server]
playbook-test-1.gce.cloudera.com license_file=/root/cloudera_license.txt

[db_server]
playbook-test-1.gce.cloudera.com

[krb5_server]
playbook-test-1.gce.cloudera.com        default_realm=EXAMPLE.COM

[utility_servers:children]
scm_server
db_server
krb5_server

[gateway_servers]
playbook-test-1.gce.cloudera.com        host_template=HostTemplate-Gateway role_ref_names=HDFS-HTTPFS-1

[master_servers]
playbook-test-2.gce.cloudera.com        host_template=HostTemplate-Master1
playbook-test-3.gce.cloudera.com        host_template=HostTemplate-Master2
playbook-test-4.gce.cloudera.com        host_template=HostTemplate-Master3

[worker_servers]
playbook-test-5.gce.cloudera.com
playbook-test-6.gce.cloudera.com
playbook-test-7.gce.cloudera.com

[worker_servers:vars]
host_template=HostTemplate-Workers

[cdh_servers:children]
utility_servers
gateway_servers
master_servers
worker_servers
alcher commented 5 years ago

Confirmed working when using FQDNs.

roczei commented 5 years ago

I have just updated the README.md file in the git repository and highlighted that fully qualified domain names (FQDNs) are mandatory in the ansible_hosts file.
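
The FQDN requirement can be checked before a playbook run. The following is a minimal sketch, not part of the playbook: the helper names (`inventory_hosts`, `looks_like_fqdn`, `short_names`) are hypothetical, the parser only handles the simple INI layout shown in this thread, and the FQDN test is a crude heuristic (at least two dot-separated labels, and not a bare IPv4 address).

```python
import re

# Hypothetical helpers for illustration; only the simple INI inventory
# layout used in this thread is handled.
SECTION_RE = re.compile(r"^\[(?P<name>[^\]]+)\]$")


def inventory_hosts(text):
    """Return the set of host names listed in a simple INI-style inventory."""
    hosts = set()
    in_host_section = True  # hosts may appear before the first [group]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        match = SECTION_RE.match(line)
        if match:
            # [group:vars] and [group:children] sections hold no host names.
            in_host_section = ":" not in match.group("name")
            continue
        if in_host_section:
            # The host name is the first token; host variables follow it.
            hosts.add(line.split()[0])
    return hosts


def looks_like_fqdn(name):
    """Crude check: two or more non-empty labels, and not an IPv4 address."""
    labels = name.split(".")
    if len(labels) < 2 or not all(labels):
        return False
    return not all(label.isdigit() for label in labels)


def short_names(text):
    """Inventory entries that do not look like FQDNs, sorted."""
    return sorted(h for h in inventory_hosts(text) if not looks_like_fqdn(h))


SAMPLE = """\
[scm_server]
server-admin-01 license_file=/home/admin/cloudera_license.txt

[worker_servers]
playbook-test-5.gce.cloudera.com

[worker_servers:vars]
host_template=HostTemplate-Workers
"""

print(short_names(SAMPLE))  # ['server-admin-01']
```

Any name the script prints is a candidate for replacement with its FQDN in the ansible_hosts file. IP addresses are flagged as well, since they are not FQDNs either.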

dbeech commented 5 years ago

Hello. This issue occurs when the hostnames in the inventory file differ from the hostnames reported by each node's Cloudera Manager agent. Using short names instead of FQDNs is one possible cause, and DNS configuration (or misconfiguration) is another.
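
The mismatch described above can be sketched as a plain name comparison. This is only an illustration under assumptions: the function names are hypothetical, the `example.com` domain is made up, and the agent-reported names are assumed to have been collected separately (for example, copied from the Hosts page in Cloudera Manager).

```python
def unmatched_hosts(inventory_names, agent_names):
    """Inventory names that no Cloudera Manager agent reports verbatim."""
    reported = set(agent_names)
    return sorted(name for name in inventory_names if name not in reported)


def suggest_matches(inventory_names, agent_names):
    """Map each unmatched inventory name to agent names with the same short host part."""
    suggestions = {}
    for name in unmatched_hosts(inventory_names, agent_names):
        short = name.split(".")[0]
        suggestions[name] = sorted(
            agent for agent in agent_names if agent.split(".")[0] == short
        )
    return suggestions


# Assumed sample data: the inventory uses a short name for one host,
# while the agents report FQDNs.
inventory = ["server-admin-01", "server-data-01.example.com"]
agents = ["server-admin-01.example.com", "server-data-01.example.com"]

print(suggest_matches(inventory, agents))
# {'server-admin-01': ['server-admin-01.example.com']}
```

An empty suggestion list for a host means no agent reports a name with the same short part, which points to a DNS or agent registration problem rather than a missing domain suffix.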

Closing for now. If you have further issues, please reopen and provide debug logs of your playbook run, specifically for the scm role (use the -vv verbosity setting).