Capgemini / kubeform

Form your :boat: Kubernetes :anchor: cluster anywhere using CoreOS, Terraform and Ansible
https://capgemini.github.io/kubeform
MIT License

Ansible gives error on AWS after terraform apply. No such file or directory #138

Closed. aldarund closed this issue 8 years ago

aldarund commented 8 years ago

According to the docs, after terraform apply I'm running ansible-playbook, but it fails with "No such file or directory".

[dm@localhost kubeform]$ ansible-playbook -u core --ssh-common-args="-F /tmp/kubeform/terraform/aws/public-cloud/ssh.config -i /tmp/kubeform/terraform/aws/public-cloud/id_rsa -q" --inventory-file=inventory site.yml -e kube_apiserver_vip=$(cd /tmp/kubeform/terraform/aws/public-cloud && terraform output master_elb_hostname)
[DEPRECATION WARNING]: Instead of sudo/sudo_user, use become/become_user and make sure become_method is 'sudo' (default).
This feature will be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.

PLAY [all:!role=bastion] *******************************************************

TASK [Wait for ssh port to become available from bastion server.] **************
skipping: [kube-worker-0]
skipping: [kube-edge-router-0]
skipping: [kube-master-0]

TASK [Wait for port 22 to become available from local server.] *****************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 2] No such file or directory
fatal: [kube-master-0 -> localhost]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Traceback (most recent call last):\n  File \"<stdin>\", line 114, in <module>\n  File \"<stdin>\", line 28, in invoke_module\n  File \"/usr/lib64/python2.7/subprocess.py\", line 711, in __init__\n    errread, errwrite)\n  File \"/usr/lib64/python2.7/subprocess.py\", line 1327, in _execute_child\n    raise child_exception\nOSError: [Errno 2] No such file or directory\n", "module_stdout": "", "msg": "MODULE FAILURE", "parsed": false}
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 2] No such file or directory
fatal: [kube-edge-router-0 -> localhost]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Traceback (most recent call last):\n  File \"<stdin>\", line 114, in <module>\n  File \"<stdin>\", line 28, in invoke_module\n  File \"/usr/lib64/python2.7/subprocess.py\", line 711, in __init__\n    errread, errwrite)\n  File \"/usr/lib64/python2.7/subprocess.py\", line 1327, in _execute_child\n    raise child_exception\nOSError: [Errno 2] No such file or directory\n", "module_stdout": "", "msg": "MODULE FAILURE", "parsed": false}
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 2] No such file or directory
fatal: [kube-worker-0 -> localhost]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Traceback (most recent call last):\n  File \"<stdin>\", line 114, in <module>\n  File \"<stdin>\", line 28, in invoke_module\n  File \"/usr/lib64/python2.7/subprocess.py\", line 711, in __init__\n    errread, errwrite)\n  File \"/usr/lib64/python2.7/subprocess.py\", line 1327, in _execute_child\n    raise child_exception\nOSError: [Errno 2] No such file or directory\n", "module_stdout": "", "msg": "MODULE FAILURE", "parsed": false}
enxebre commented 8 years ago

Hey @aldarund, I'm not able to reproduce that. Can you please tell us which version of ansible you are running? Also make sure you've run export TF_VAR_STATE_ROOT=/tmp/kubeform/terraform/aws/public-cloud and that you have cloned the project into /tmp/kubeform. Thanks!
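
A quick pre-flight check for these prerequisites could look like the sketch below. The helper name check_state_root is hypothetical and not part of kubeform, and the assumption that id_rsa sits directly under the state root is inferred from the paths in the ansible-playbook command above rather than from the project's documentation.

```shell
# Hypothetical pre-flight helper; not part of kubeform.
# Assumes Terraform writes id_rsa directly under $TF_VAR_STATE_ROOT,
# as the paths in the ansible-playbook command above suggest.
check_state_root() {
  if [ -z "${TF_VAR_STATE_ROOT:-}" ]; then
    echo "TF_VAR_STATE_ROOT is not set" >&2
    return 1
  fi
  if [ ! -d "$TF_VAR_STATE_ROOT" ]; then
    echo "$TF_VAR_STATE_ROOT is not a directory" >&2
    return 1
  fi
  if [ ! -f "$TF_VAR_STATE_ROOT/id_rsa" ]; then
    echo "no id_rsa found in $TF_VAR_STATE_ROOT" >&2
    return 1
  fi
  echo "state root looks usable: $TF_VAR_STATE_ROOT"
}
```

Calling check_state_root in the same shell that will run ansible-playbook shows whether the export is actually visible there.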

aldarund commented 8 years ago

@enxebre The export was done. I did it again and nothing changed. The project is also cloned into /tmp/kubeform.

[dm@localhost kubeform]$ ansible-playbook --version
ansible-playbook 2.1.0.0
  config file = /tmp/kubeform/ansible.cfg
  configured module search path = ['./library']
enxebre commented 8 years ago

Hey @aldarund, we can't guarantee success with 2.1.0.0 at the moment, as it has not been tested. We currently rely on 2.0.2.0 (see https://github.com/Capgemini/kubeform/blob/master/requirements.txt#L1 ). We'll be providing a one-shot deployment that handles these dependencies soon. If you are still having issues with 2.0.2.0, it would be useful to add the -vvv flag to the ansible command so we can get more feedback.
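
As a rough sketch, the installed version can be compared against the pinned one before running the playbook. The helper name version_matches is hypothetical, and it assumes the first line of the version output has the form "ansible-playbook X.Y.Z.W" as shown in this thread; other Ansible releases may print a different format.

```shell
# Hypothetical helper; not part of kubeform.
# $1: first line of `ansible-playbook --version`, e.g. "ansible-playbook 2.1.0.0"
# $2: the version the project is tested against, e.g. "2.0.2.0"
version_matches() {
  reported="${1##* }"   # keep only the last space-separated word
  [ "$reported" = "$2" ]
}

# Example use (commented out so that sourcing this file runs nothing):
# version_matches "$(ansible-playbook --version | head -n 1)" "2.0.2.0" \
#   || echo "untested Ansible version" >&2
```

If the versions differ, installing the pinned release into a virtualenv with pip install ansible==2.0.2.0 may be simpler than relying on a distribution package, though that depends on your environment.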

aldarund commented 8 years ago

@enxebre OK, I see. I will try with 2.0.2.0, but I need to build it from source, since only 2.1 is available in EPEL for CentOS 7.

aldarund commented 8 years ago

@enxebre tried with 2.0.2.0. Different error now

TASK [Wait for port 22 to become available from local server.] *****************
ok: [kube-worker-0 -> localhost]
ok: [kube-edge-router-0 -> localhost]
ok: [kube-master-0 -> localhost]

PLAY [bootstrap coreos hosts] **************************************************

TASK [coreos_timezone : include] ***********************************************
included: /tmp/kubeform/roles/coreos_timezone/tasks/timezone.yml for kube-worker-0, kube-edge-router-0, kube-master-0

TASK [coreos_timezone : check if timezone is already set correctly] ************
fatal: [kube-edge-router-0]: UNREACHABLE! => {"changed": false, "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
fatal: [kube-master-0]: UNREACHABLE! => {"changed": false, "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
fatal: [kube-worker-0]: UNREACHABLE! => {"changed": false, "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
aldarund commented 8 years ago

And I certainly can reach it via ssh, e.g.:

[dm@localhost kubeform]$ ssh  52.48.0.203
The authenticity of host '52.48.0.203 (52.48.0.203)' can't be established.
ECDSA key fingerprint is e3:60:85:0c:42:d3:6d:33:8f:64:09:04:4a:4a:ab:ff.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '52.48.0.203' (ECDSA) to the list of known hosts.
enxebre commented 8 years ago

Hey @aldarund, it seems we've got an out-of-date command in the docs. It should be:

ansible-playbook -u core --ssh-common-args="-i terraform/aws/public-cloud/id_rsa -q" --inventory-file=inventory site.yml

So I think it's probably -F /tmp/kubeform/terraform/aws/public-cloud/ssh.config that is making it fail for you.

You can also test it by running something like:

ansible kube-worker-0 -m setup -i inventory/ --private-key=terraform/aws/public-cloud/id_rsa -u core --list-hosts

That should return all the extra vars for kube-worker-0.

You should also be able to connect via ssh by adding the key with ssh-add terraform/aws/public-cloud/id_rsa and then running ssh core@52.48.0.203. Hope that helps.
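
Reaching the host-key prompt only shows that port 22 answers; it does not prove that key-based login as core works, which is what Ansible needs. A non-interactive check could be sketched as below. The function name can_ssh_as_core and the SSH_BIN override are hypothetical; BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# Hypothetical helper; not part of kubeform.
# $1: path to the private key, $2: host name or IP address.
# SSH_BIN can be overridden (e.g. for a dry run); it defaults to ssh.
can_ssh_as_core() {
  # BatchMode=yes makes ssh fail instead of prompting for a password,
  # so a non-zero exit status means key-based login did not work.
  "${SSH_BIN:-ssh}" -i "$1" -o BatchMode=yes -o ConnectTimeout=5 "core@$2" true
}

# Example use (commented out so that sourcing this file runs nothing):
# can_ssh_as_core terraform/aws/public-cloud/id_rsa 52.48.0.203 \
#   && echo "key-based login as core works"
```

Note that BatchMode also refuses to prompt for an unknown host key, so the host may need to be in known_hosts first.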

aldarund commented 8 years ago

@enxebre Now new error :)

TASK [coreos_timezone : setup new timezone] ************************************
fatal: [kube-worker-0]: FAILED! => {"changed": false, "failed": true, "module_stderr": "/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)\n/home/core/pypy/bin/pypy: /lib64/libssl.so.1.0.0: no version information available (required by /home/core/pypy/bin/pypy)\n/home/core/pypy/bin/pypy: /lib64/libcrypto.so.1.0.0: no version information available (required by /home/core/pypy/bin/pypy)\n/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)\nTraceback (most recent call last):\n  File \"app_main.py\", line 75, in run_toplevel\n  File \"app_main.py\", line 636, in run_it\n  File \"<stdin>\", line 2422, in <module>\n  File \"<stdin>\", line 401, in main\nTypeError: set_fs_attributes_if_different() takes exactly 3 arguments (4 given)\n", "module_stdout": "", "msg": "MODULE FAILURE", "parsed": false}
fatal: [kube-master-0]: FAILED! => {"changed": false, "failed": true, "module_stderr": "/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)\n/home/core/pypy/bin/pypy: /lib64/libssl.so.1.0.0: no version information available (required by /home/core/pypy/bin/pypy)\n/home/core/pypy/bin/pypy: /lib64/libcrypto.so.1.0.0: no version information available (required by /home/core/pypy/bin/pypy)\n/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)\nTraceback (most recent call last):\n  File \"app_main.py\", line 75, in run_toplevel\n  File \"app_main.py\", line 636, in run_it\n  File \"<stdin>\", line 2422, in <module>\n  File \"<stdin>\", line 401, in main\nTypeError: set_fs_attributes_if_different() takes exactly 3 arguments (4 given)\n", "module_stdout": "", "msg": "MODULE FAILURE", "parsed": false}
fatal: [kube-edge-router-0]: FAILED! => {"changed": false, "failed": true, "module_stderr": "/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)\n/home/core/pypy/bin/pypy: /lib64/libssl.so.1.0.0: no version information available (required by /home/core/pypy/bin/pypy)\n/home/core/pypy/bin/pypy: /lib64/libcrypto.so.1.0.0: no version information available (required by /home/core/pypy/bin/pypy)\n/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)\nTraceback (most recent call last):\n  File \"app_main.py\", line 75, in run_toplevel\n  File \"app_main.py\", line 636, in run_it\n  File \"<stdin>\", line 2422, in <module>\n  File \"<stdin>\", line 401, in main\nTypeError: set_fs_attributes_if_different() takes exactly 3 arguments (4 given)\n", "module_stdout": "", "msg": "MODULE FAILURE", "parsed": false}
aldarund commented 8 years ago

Never mind, it was my slightly incorrectly built ansible. ansible finished successfully.

enxebre commented 8 years ago

Cool, I'm closing this issue then. Feel free to have a look at this video demo https://www.youtube.com/watch?v=Ejc5rKTzHiQ on how to start playing with the cluster and edge router ingress scaling. Thanks for your feedback!