danielburrell opened this issue 3 years ago
@danielburrell I can confirm that the module looks for the kubeconfig on whichever host it is being run on. It will work fine on a managed node and look for the kubeconfig on that node. There should be no need to copy the kubeconfig from the managed node to the controller. I am unable to reproduce the behavior you describe using:
- ansible 2.10.8
- community.kubernetes 1.2.1
- openshift 0.12.0
- kubernetes 12.0.1
I tried running the playbook on a managed node with a different user, and it found the kubeconfig whether it was in the default location (`~/.kube/config`) or moved to a non-default location with the path specified via the `kubeconfig` parameter. If I put a bunch of garbage in the kubeconfig on the managed node, the kubeconfig fails to load because of that, so it's clearly finding the correct kubeconfig.
I'm not sure what to suggest other than to double check that the kubeconfig exists at the path you are specifying.
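For anyone following along, the shape of task being discussed looks roughly like this (a minimal sketch; the inventory name, namespace, and paths are illustrative, not taken from either playbook):

```yaml
- hosts: managed-node              # a managed node, not localhost (illustrative name)
  tasks:
    # With no kubeconfig parameter, the module falls back to the default
    # location (~/.kube/config) on the host where the task runs.
    - name: List pods using the default kubeconfig location
      community.kubernetes.k8s_info:
        kind: Pod
        namespace: default

    # An explicit, non-default path should be resolved on that same host.
    - name: List pods using an explicit kubeconfig path
      community.kubernetes.k8s_info:
        kind: Pod
        namespace: default
        kubeconfig: /opt/kube/config   # illustrative non-default path
```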
- ansible 2.10.7 (this is the latest available version in my organization)
- ansible-base 2.10.6
- openshift 0.11.2 (because of a bug in openshift 0.12.0)
- kubernetes 11.0.0

I'm not sure what the version of community.kubernetes is; I guess it is determined by the ansible version, since it's bundled with ansible?
I'll try to come up with a test-case repo. In the meantime I ran it again, and I think the following output demonstrates that the file exists at the path:
```
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
failed: [10.50.52.94] (item={'name': 'traefik', 'quantity': 1}) => {"ansible_loop_var": "item", "attempts": 5, "changed": false, "item": {"name": "traefik", "quantity": 1}, "msg": "Could not find or access '/home/centos/.kube/config' on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"}

NO MORE HOSTS LEFT ********************************************************************************************************************************************************

PLAY RECAP ****************************************************************************************************************************************************************
10.50.52.94                : ok=65   changed=19   unreachable=0    failed=1    skipped=4    rescued=0    ignored=0
localhost                  : ok=11   changed=4    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
```
```
ssh -i ../cloudtls.pem centos@10.50.52.94
Last login: Wed Apr 14 08:14:11 2021 from <redacted>
[centos@10.50.52.94 ~]$ sudo su
[root@10.50.52.94 centos]# ls -lart /home/centos/.kube/config
-rw-r--r--. 1 centos root 1054 Apr 14 08:13 /home/centos/.kube/config
```
So at the point of failure, the playbook terminates, and if you log back into the box, the file is right there.
Side note: in the case where the file is missing, the error message is:
```
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: If you are using a module and expect the file to exist on the remote, see the remote_src option
failed: [10.50.52.94] (item={'name': 'metrics-server', 'quantity': 1}) => {"ansible_loop_var": "item", "attempts": 5, "changed": false, "item": {"name": "metrics-server", "quantity": 1}, "msg": "Could not find or access '/home/centos/.kube/config' on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"}
```
Why does it say "on the Ansible Controller"? The Ansible controller is the machine you run Ansible from, correct? Anything else is a managed node or host. Am I wrong, or is the message generic/misleading?
I'm confused because the error message about "the Ansible Controller" doesn't exist in this repo. Can you post the output of a failing playbook with full verbosity (`-vvvv`)?
```
<10.50.52.94> ESTABLISH SSH CONNECTION FOR USER: centos
<10.50.52.94> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o 'IdentityFile="/home/daniel/projects/prom/cloudtls.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="centos"' -o ConnectTimeout=10 -o ControlPath=/home/daniel/.ansible/cp/b5fb5e1ade 10.50.52.94 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /home/centos/.ansible/tmp `"&& mkdir "` echo /home/centos/.ansible/tmp/ansible-tmp-1618407874.488945-17766-127442268166527 `" && echo ansible-tmp-1618407874.488945-17766-127442268166527="` echo /home/centos/.ansible/tmp/ansible-tmp-1618407874.488945-17766-127442268166527 `" ) && sleep 0'"'"''
<10.50.52.94> (0, b'ansible-tmp-1618407874.488945-17766-127442268166527=/home/centos/.ansible/tmp/ansible-tmp-1618407874.488945-17766-127442268166527\n', b'OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017\r\ndebug1: Reading configuration data /home/daniel/.ssh/config\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 58: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 4 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 16322\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 0\r\n')
<10.50.52.94> ESTABLISH SSH CONNECTION FOR USER: centos
<10.50.52.94> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o 'IdentityFile="/home/daniel/projects/prom/cloudtls.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="centos"' -o ConnectTimeout=10 -o ControlPath=/home/daniel/.ansible/cp/b5fb5e1ade 10.50.52.94 '/bin/sh -c '"'"'rm -f -r /home/centos/.ansible/tmp/ansible-tmp-1618407874.488945-17766-127442268166527/ > /dev/null 2>&1 && sleep 0'"'"''
<10.50.52.94> (0, b'', b'OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017\r\ndebug1: Reading configuration data /home/daniel/.ssh/config\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 58: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 4 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 16322\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 0\r\n')
The full traceback is:
Traceback (most recent call last):
  File "/home/daniel/.pex/installed_wheels/c708016249a31ecd1c4fc3c5b03d3dd85e595252/ansible-2.10.7-py3-none-any.whl/ansible_collections/community/kubernetes/plugins/action/k8s_info.py", line 51, in run
    kubeconfig = self._find_needle('files', kubeconfig)
  File "/home/daniel/.pex/installed_wheels/04c26471cb05787fcd8372d2f2bea63afb042678/ansible_base-2.10.6-py3-none-any.whl/ansible/plugins/action/__init__.py", line 1232, in _find_needle
    return self._loader.path_dwim_relative_stack(path_stack, dirname, needle)
  File "/home/daniel/.pex/installed_wheels/04c26471cb05787fcd8372d2f2bea63afb042678/ansible_base-2.10.6-py3-none-any.whl/ansible/parsing/dataloader.py", line 327, in path_dwim_relative_stack
    raise AnsibleFileNotFound(file_name=source, paths=[to_native(p) for p in search])
ansible.errors.AnsibleFileNotFound: Could not find or access '/home/centos/.kube/config' on the Ansible Controller.
If you are using a module and expect the file to exist on the remote, see the remote_src option
failed: [10.50.52.94] (item={'name': 'traefik', 'quantity': 1}) => {
    "ansible_loop_var": "item",
    "attempts": 5,
    "changed": false,
    "item": {
        "name": "traefik",
        "quantity": 1
    },
    "msg": "Could not find or access '/home/centos/.kube/config' on the Ansible Controller.\nIf you are using a module and expect the file to exist on the remote, see the remote_src option"
}
```
OK, I'm pretty sure what's going on here is you have an old version of community.kubernetes. I would suggest upgrading to 1.2.1 and seeing if that fixes your problem.
Is it possible to select a version for that library? I thought it was determined by, and bundled with, the ansible binary. Is there a way to check which version I'm using?
Since you're using ansible 2.10, you should be able to do:

```
$ ansible-galaxy collection list | grep kubernetes
```

You can install the latest version by doing:

```
$ ansible-galaxy collection install community.kubernetes
```

More info: https://docs.ansible.com/ansible/latest/user_guide/collections_using.html
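If you need to pin a specific version rather than take the latest, a requirements file should also work (a minimal sketch; the version shown is just the one suggested above):

```yaml
# requirements.yml
collections:
  - name: community.kubernetes
    version: "1.2.1"
```

followed by `ansible-galaxy collection install -r requirements.yml`.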
@danielburrell have you been able to test this after upgrading? Could you report if it is working for you?
Sorry for the delay. It turns out that when building our 'pex' we have been using the Python dependency known as `ansible`, which bundles some of the community collections, including the kubernetes and helm ones. I cannot see a more up-to-date version of this `ansible` package on PyPI, and I note that the community.kubernetes version bundled with the version I'm using is not the latest.

So if I want to upgrade the community collections, I'll have to either wait for a new `ansible` release, or migrate to `ansible-core` (previously known as `ansible-base`?) and find a way to airgap the community collections.

I am going to try the latter, as it seems to me that `ansible-core` + collections is being maintained more actively than the `ansible` package, and if I can get it to work it will mean being able to pick up bugfixes in collections.

The supply of collections is a bit trickier though, and I'm not sure how the installation of these collections interacts with custom virtual environments (some of our tasks specify the Python version from a particular venv).

I think using ansible to deploy software behind an airgap (with no public internet connection) is a bit trickier, so sorry if it takes a while longer to confirm.
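For what it's worth, the usual pattern for getting collections across an airgap is `ansible-galaxy collection download` on a connected machine, then `ansible-galaxy collection install` from the copied files on the isolated side. A sketch, with illustrative paths and version pin:

```yaml
# requirements.yml (version pin is illustrative)
collections:
  - name: community.kubernetes
    version: "1.2.1"

# On a machine with internet access:
#   ansible-galaxy collection download -r requirements.yml -p ./collections-offline
# This writes the collection tarballs plus a requirements.yml into that directory.
# Copy ./collections-offline across the airgap, then on the isolated controller:
#   ansible-galaxy collection install -r ./collections-offline/requirements.yml
```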
SUMMARY
When the k8s_info module runs on a non-controller host, it looks for a kubeconfig file on the controller host, but returns an "invalid kube-config file" error even though the kubeconfig file on the controller is valid.
To demonstrate this, I have two scenarios to compare: the first (CI) seems to work by coincidence; the second (local development) reveals the bug.
CI Scenario:

- `~centos/.kube/config` is copied to the control box at the same location, `~centos/.kube/config`, near the end of the play.
- With `hosts: server` (target), specifying the kubeconfig file, I am able to confirm certain objects exist in my cluster. This works as intended.
- `whoami` on all boxes would return `centos`, and `ansible_user` would also return `centos`. The two boxes are basically the same, just a different 'purpose'.

This is the only scenario that works.

Local Development Scenario:
- My local user is `daniel`, and there is a target box configured the same way as in the CI scenario (with a `centos` user).
- I run `ansible-playbook .... -u centos`, so the ssh user is `centos`, but I run the installer as `daniel`. So `whoami` on the localhost would return `daniel` and anywhere else would return `centos`.
- As above, the kubeconfig file is in its place on the target machine and, as before, gets copied to the control box. Note that in the CI scenario the copy went to an identical folder, `~centos/.kube/config`, but in this scenario it's copied to `~daniel/.kube/config`.
The documentation does not state where the kubeconfig file must be located, nor does it state that k8s_info must be run on the control box, so until now I had no reason to think anything was wrong. When my k8s_info task ran on the target machine, I assumed it was using the target machine's kubeconfig (not the control box's).
When I try to run a k8s_info action in this scenario with `hosts: server` (target), specifying the kubeconfig file as `~centos/.kube/config`, it says the file cannot be found on the Ansible control machine (of course: it's now located at `~daniel/.kube/config` on the control machine).

This suggests that regardless of `hosts`, the role is expecting to find the kubeconfig file on the control machine. Can you confirm this is the case?

Assuming this is true, if I tell the installer to use `~daniel/.kube/config` (which exists on the control machine), with `hosts: server`, then it tells me that the config isn't valid!

The only scenario that works with `hosts: server` is if my target and control box both have a kubeconfig file in the same location. This seems to be a bug.
Can you clarify the following:

- Is k8s_info meant to be run against `localhost` only, or anywhere?
- Which `kubeconfig` file is used: the control machine's, or the current `inventory_host`'s?
This is all very strange, as it causes the playbook to fail even though I can cat the file, it's perfectly valid, and it works with kubectl.

Any ideas?
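Two workaround sketches, assuming the kubeconfig really is being resolved on the controller (the namespace and controller-side paths are illustrative; `traefik` is taken from the failing loop item above):

```yaml
# Option 1: fetch the kubeconfig from the target to the controller first,
# then point the module at the fetched copy.
- name: Copy kubeconfig from the target to the controller
  fetch:
    src: /home/centos/.kube/config
    dest: /tmp/fetched-kubeconfig      # controller-side path (illustrative)
    flat: yes

- name: Query the cluster with the fetched kubeconfig
  community.kubernetes.k8s_info:
    kind: Deployment
    name: traefik
    namespace: kube-system             # illustrative namespace
    kubeconfig: /tmp/fetched-kubeconfig

# Option 2: delegate the task to the controller and use a kubeconfig that
# lives there, provided the controller can reach the cluster API.
- name: Query the cluster from the controller
  community.kubernetes.k8s_info:
    kind: Deployment
    name: traefik
    namespace: kube-system
    kubeconfig: ~/.kube/config         # controller-side path
  delegate_to: localhost
```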
ISSUE TYPE
Bug Report
COMPONENT NAME
k8s_info
ANSIBLE VERSION
ansible 2.10.7 (ansible-base 2.10.6)
CONFIGURATION
OS / ENVIRONMENT
CentOS 7.9
STEPS TO REPRODUCE
EXPECTED RESULTS
I would have expected my kubeconfig file to be considered valid.
ACTUAL RESULTS