ansible-collections / kubernetes.core

The collection includes a variety of Ansible content to help automate the management of applications in Kubernetes and OpenShift clusters, as well as the provisioning and maintenance of clusters themselves.

Cannot be used with EKS #461

Open olahouze opened 2 years ago

olahouze commented 2 years ago
SUMMARY

I can't use the module to administer AWS EKS clusters

ISSUE TYPE
COMPONENT NAME

kubernetes.core.k8s_exec

ANSIBLE VERSION
ansible 2.9.6
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/<my-user>/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  executable location = /usr/bin/ansible
  python version = 3.8.10 (default, Mar 15 2022, 12:22:08) [GCC 9.4.0]
COLLECTION VERSION

ansible-galaxy collection list doesn't work on my computer. These are the collections I have:

ll /home//.ansible/collections/ansible_collections/community/kubernetes/
total 136
drwxrwxr-x 7 4096 mai 12 16:13 ./
drwxrwxr-x 3 4096 mai 12 16:13 ../
-rw------- 1 36 mai 12 16:13 bindep.txt
-rw------- 1 16344 mai 12 16:13 CHANGELOG.rst
drwxrwxr-x 2 4096 mai 12 16:13 changelogs/
-rw------- 1 107 mai 12 16:13 codecov.yml
-rw------- 1 3311 mai 12 16:13 CONTRIBUTING.md
drwxrwxr-x 3 4096 mai 12 16:13 docs/
-rw------- 1 5649 mai 12 16:13 FILES.json
drwxrwxr-x 3 4096 mai 12 16:13 .github/
-rw------- 1 190 mai 12 16:13 .gitignore
-rw------- 1 35148 mai 12 16:13 LICENSE
-rw------- 1 1205 mai 12 16:13 Makefile
-rw------- 1 1224 mai 12 16:13 MANIFEST.json
drwxrwxr-x 2 4096 mai 12 16:13 meta/
-rw------- 1 6396 mai 12 16:13 README.md
-rw------- 1 35 mai 12 16:13 requirements.txt
-rw------- 1 50 mai 12 16:13 setup.cfg
-rw------- 1 20 mai 12 16:13 test-requirements.txt
drwxrwxr-x 3 4096 mai 12 16:13 tests/
-rw------- 1 264 mai 12 16:13 .yamllint


##### CONFIGURATION
The command **ansible-config dump --only-changed** returns no output.

##### OS / ENVIRONMENT
Linux XXXX 5.13.0-41-generic #46~20.04.1-Ubuntu

##### STEPS TO REPRODUCE
My parameters:

**kubeconfig:**
```yaml
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0tL.....Qo=
    server: https://48F2BD40CC5BBFCECD8DC24F8DXXXXXX.yl4.eu-west-3.eks.amazonaws.com
  name: arn:aws:eks:eu-west-3:XXXXXXXXXXXX:cluster/aws-common-development
contexts:
- context:
    cluster: arn:aws:eks:eu-west-3:XXXXXXXXXXXX:cluster/aws-common-development
    namespace: test
    user: arn:aws:eks:eu-west-3:XXXXXXXXXXXX:cluster/aws-common-development
  name: arn:aws:eks:eu-west-3:XXXXXXXXXXXX:cluster/aws-common-development
current-context: arn:aws:eks:eu-west-3:XXXXXXXXXXXX:cluster/aws-common-development
kind: Config
preferences: {}
users:
- name: arn:aws:eks:eu-west-3:XXXXXXXXXXXX:cluster/aws-common-development
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - --region
      - eu-west-3
      - eks
      - get-token
      - --cluster-name
      - aws-common-development
      command: aws
      env:
      - name: AWS_PROFILE
        value: common
      interactiveMode: IfAvailable
      provideClusterInfo: false
```

Test playbook:

- name: "Test k8s commande"
  kubernetes.core.k8s_exec:
    kubeconfig: "{{kubeconfig}}"
    username: "{{kubectl_username}}"
    context: "{{kubectl_context}}"
    namespace: test
    pod: network-tools
    command: "ip a"

With parameters:


kubeconfig: "/home/<my-user>/.kube/config"
kubectl_context: arn:aws:eks:eu-west-3:XXXXXXXXXXXX:cluster/aws-common-development
kubectl_username: arn:aws:eks:eu-west-3:XXXXXXXXXXXX:cluster/aws-common-development
kubectl_namespace: test
EXPECTED RESULTS

I would like the command to run on the pod and return the result

ACTUAL RESULTS

Error message from Ansible:

TASK [test : Test k8s commande] ***************************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Failed to get client due to 403\nReason: Forbidden\nHTTP response headers: HTTPHeaderDict({'Audit-Id': 'f359f293-3066-48d5-997f-7d2a34563c43', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '733e9229-3fcc-40db-b964-912b55e96e9e', 'X-Kubernetes-Pf-Prioritylevel-Uid': '2ae196cc-4de5-4a2d-aac5-431bb93d0b76', 'Date': 'Fri, 13 May 2022 14:55:13 GMT', 'Content-Length': '189'})\nHTTP response body: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"forbidden: User \\\\\"system:anonymous\\\\\" cannot get path \\\\\"/apis\\\\\"\",\"reason\":\"Forbidden\",\"details\":{},\"code\":403}\\n'\nOriginal traceback: \n  File \"/usr/local/lib/python3.8/dist-packages/kubernetes/dynamic/client.py\", line 55, in inner\n    resp = func(self, *args, **kwargs)\n\n  File \"/usr/local/lib/python3.8/dist-packages/kubernetes/dynamic/client.py\", line 270, in request\n    api_response = self.client.call_api(\n\n  File \"/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py\", line 348, in call_api\n    return self.__call_api(resource_path, method,\n\n  File \"/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py\", line 180, in __call_api\n    response_data = self.request(\n\n  File \"/usr/local/lib/python3.8/dist-packages/kubernetes/client/api_client.py\", line 373, in request\n    return self.rest_client.GET(url,\n\n  File \"/usr/local/lib/python3.8/dist-packages/kubernetes/client/rest.py\", line 240, in GET\n    return self.request(\"GET\", url,\n\n  File \"/usr/local/lib/python3.8/dist-packages/kubernetes/client/rest.py\", line 234, in request\n    raise ApiException(http_resp=r)\n"}
gravesm commented 2 years ago

@olahouze Are you able to successfully use kubectl to run the command using the same kubeconfig?

olahouze commented 2 years ago

Yes, no problem with kubectl and the same kubeconfig file.

karthik-krishnaswamy17 commented 2 years ago

Did this get solved? I too get the same error.

gravesm commented 2 years ago

I am not able to reproduce this on my end. Are you able to successfully retrieve your ExecCredential with your exec based auth command? In your case, it should be:

$ AWS_PROFILE=common aws --region eu-west-3 eks get-token --cluster-name aws-common-development
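
To check what that command returns, here is a minimal Python sketch. It assumes the output is an ExecCredential JSON document with the token under `status.token`. The helper name `extract_token` and the sample values are hypothetical, not part of any library.

```python
import json


def extract_token(exec_credential_json: str) -> str:
    """Return the bearer token from an ExecCredential JSON document.

    Raises ValueError when the document is not an ExecCredential or
    carries no token, which would leave the client unauthenticated.
    """
    doc = json.loads(exec_credential_json)
    if doc.get("kind") != "ExecCredential":
        raise ValueError(f"unexpected kind: {doc.get('kind')!r}")
    token = (doc.get("status") or {}).get("token")
    if not token:
        raise ValueError("ExecCredential has no status.token")
    return token


# Hypothetical sample; a real document comes from `aws eks get-token`.
SAMPLE = json.dumps({
    "kind": "ExecCredential",
    "apiVersion": "client.authentication.k8s.io/v1beta1",
    "spec": {},
    "status": {
        "expirationTimestamp": "2022-05-13T15:09:13Z",
        "token": "k8s-aws-v1.EXAMPLE",
    },
})

if __name__ == "__main__":
    print(extract_token(SAMPLE))
```

If the command prints nothing, or a document without `status.token`, the problem is likely on the AWS CLI side rather than in the Ansible module.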
jbkc85 commented 2 years ago

I am having these same issues, and it's VERY hard to reproduce. I am using a service account, so my kubeconfig is token based.

In random namespaces, I encounter random issues, even though permissions are set accordingly. In other namespaces, things are perfectly fine. Here is an example:

{"changed": false, "msg": "Scale request failed: 403\nReason: Forbidden\nHTTP response headers: HTTPHeaderDict({'Audit-Id': '253def2c-33e7-4be1-9ad1-d317de42297e', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '03379d20-3ead-4315-a979-5544ad552e44', 'X-Kubernetes-Pf-Prioritylevel-Uid': '95596f23-5203-4fdd-babf-ac3ccd0dc14c', 'Date': 'Thu, 04 Aug 2022 14:42:48 GMT', 'Content-Length': '376'})\nHTTP response body: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"deployments.apps \\\\\"processor\\\\\" is forbidden: User \\\\\"system:serviceaccount:default:deployer\\\\\" cannot patch resource \\\\\"deployments/scale\\\\\" in API group \\\\\"apps\\\\\" in the namespace \\\\\"namespace\\\\\"\",\"reason\":\"Forbidden\",\"details\":{\"name\":\"processor\",\"group\":\"apps\",\"kind\":\"deployments\"},\"code\":403}

Thing is, I use Terraform, which sets the permissions for every namespace the SAME for the deployer user. It is impossible that it works in one namespace, that I can't get in the next namespace, and that I can't patch the scale in a third namespace.

Also, using kubectl, it works just fine. Using Ansible via 'k8s_scale'? Fails randomly.

    - name: scale processor up
      kubernetes.core.k8s_scale:
        api_version: v1
        kind: Deployment
        name: processor
        namespace: "{{ namespace }}"
        current_replicas: 0
        replicas: 1
        wait: false
      delegate_to: 127.0.0.1
tima commented 2 years ago

We need some additional information in trying to reproduce this.

Also, community.kubernetes has been deprecated for about 2 years and has been removed from the latest Ansible community distributions. It shouldn't be in conflict with kubernetes.core, but it's worth eliminating.

jbkc85 commented 2 years ago
ansible-playbook [core 2.11.6]
  config file = None
  python version = 3.8.10 (default, Jun 22 2022, 20:18:18) [GCC 9.4.0]
  jinja version = 2.10.1
  libyaml = True

If kubernetes.core is in ansible_collections, then the version is 1.2.1 based on the changelog. Let me know if there is a better way to check this.

gravesm commented 2 years ago

@jbkc85 You can use ansible-galaxy collection list kubernetes.core to get the version.

jdftapi commented 2 years ago

I am encountering a similar problem when trying to use k8s_exec, NOT on EKS though: I am using a ServiceAccount token to access the API, provided in a kubeconfig file. Using this same file, I am able to create a pod in the same namespace. My exec task:

- name: create target database in db cluster
  kubernetes.core.k8s_exec:
    kubeconfig: /home/user/migration/kubeconfig
    namespace: '{{ migration_namespace }}'
    pod: 'db-client-customer-st-{{ customer_id }}'
    command: 'mysql -hcluster1-pxc -uroot -p$MYSQL_DB_PASSWORD -e "CREATE DATABASE {{ db_name }}"'

generates the following error (using -vvvv):

The full traceback is:
  File "/tmp/ansible_kubernetes.core.k8s_exec_payload_ukpfy3hi/ansible_kubernetes.core.k8s_exec_payload.zip/ansible_collections/kubernetes/core/plugins/modules/k8s_exec.py", line 193, in execute_module
  File "/usr/local/lib/python3.7/dist-packages/kubernetes/stream/stream.py", line 35, in _websocket_request
    return api_method(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/kubernetes/client/api/core_v1_api.py", line 994, in connect_get_namespaced_pod_exec
    return self.connect_get_namespaced_pod_exec_with_http_info(name, namespace, **kwargs)  # noqa: E501
  File "/usr/local/lib/python3.7/dist-packages/kubernetes/client/api/core_v1_api.py", line 1115, in connect_get_namespaced_pod_exec_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python3.7/dist-packages/kubernetes/client/api_client.py", line 353, in call_api
    _preload_content, _request_timeout, _host)
  File "/usr/local/lib/python3.7/dist-packages/kubernetes/client/api_client.py", line 184, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.7/dist-packages/kubernetes/stream/ws_client.py", line 525, in websocket_call
    raise ApiException(status=0, reason=str(e))
fatal: [migration-test.ftapi.com]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "api_key": null,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
            "command": "mysql -hcluster1-pxc -uroot -p$MYSQL_DB_PASSWORD -e \"CREATE DATABASE customer_st_123456789\"",
            "container": null,
            "context": null,
            "host": null,
            "impersonate_groups": null,
            "impersonate_user": null,
            "kubeconfig": "/home/andy/migration/kubeconfig",
            "namespace": "migration-vm-to-cluster",
            "no_proxy": null,
            "password": null,
            "persist_config": null,
            "pod": "db-client-customer-st-123456789",
            "proxy": null,
            "proxy_headers": null,
            "username": null,
            "validate_certs": null
        }
    },
    "msg": "Failed to execute on pod db-client-customer-st-123456789 due to : (0)\nReason: Handshake status 403 Forbidden\n"
}

Ansible & Module versions I use:

~/ansible-playbook --version
ansible-playbook 2.10.8
  config file = None
  configured module search path = ['/home/<user>/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  executable location = /usr/bin/ansible-playbook
  python version = 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0]
~/ ansible-galaxy collection list kubernetes.core                  
# /home/<user>/.ansible/collections/ansible_collections
Collection      Version
--------------- -------
kubernetes.core 2.3.2  

The k8s cluster I use is version 1.21. If I manually run kubectl exec in my terminal using the same kubeconfig file, I do not have any issues.

olahouze commented 1 year ago

Hello.

Is there any news on this issue?

I am still facing the problem.

Best regards

adesprez commented 4 months ago

I'm facing the same issue. I followed all the related issues on GitHub. There was a fix in https://github.com/ansible/ansible/issues/45858, but it still happens intermittently for some people.

See also https://github.com/kubernetes-client/python/issues/678, in particular this comment: https://github.com/kubernetes-client/python/issues/678#issuecomment-867883391. I followed @thallesdaniell's advice and tried out https://github.com/peak-ai/eks-token.

So, instead of using the aws eks get-token exec in my kubeconfig file, I'm using this Python script:

import json
import os

from eks_token import get_token

os.environ['AWS_PROFILE'] = '<myProfile>'
token = get_token(cluster_name='<clusterName>', role_arn='<roleArn>')
token_json = json.dumps(token)

print(token_json)

On the kubeconfig file:

users:
- name: <userArn>
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - /home/adesprez/Downloads/eks-get-token.py
      command: python
      env:
      - name: VIRTUAL_ENV
        value: /home/adesprez/venvs/ansible
      interactiveMode: IfAvailable
      provideClusterInfo: false

That kubeconfig file with the Python method works fine with the kubectl command.

But Ansible still fails with the same error:

{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"forbidden: User \\\\\"system:anonymous\\\\\" cannot get path \\\\\"/apis\\\\\"\",\"reason\":\"Forbidden\",\"details\":{},\"code\":403}

It's failing on every execution, whereas kubectl works fine with either method: aws eks get-token or the Python script. I tried several things to narrow down the issue:

So it does look like Ansible completely fails to use the client.authentication.k8s.io/v1beta1 exec method, no matter what the underlying command is. As far as I know, there's no other way to authenticate with EKS.

I'm really out of ideas right now...
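
The `system:anonymous` user in the 403 body above suggests the request reached the API server with no credentials attached at all, rather than with a rejected token. Here is a small Python sketch for pulling the user out of such a Status body. It assumes the body has been unescaped into plain JSON, and the function names are hypothetical.

```python
import json
import re
from typing import Optional


def forbidden_user(status_body: str) -> Optional[str]:
    """Return the user named in a Kubernetes 403 Status message, if any."""
    status = json.loads(status_body)
    match = re.search(r'User "([^"]+)"', status.get("message", ""))
    return match.group(1) if match else None


def is_unauthenticated(status_body: str) -> bool:
    """True when the API server treated the caller as system:anonymous."""
    return forbidden_user(status_body) == "system:anonymous"


# Unescaped form of the response body quoted in this thread.
BODY = (
    '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    '"message":"forbidden: User \\"system:anonymous\\" cannot get path \\"/apis\\"",'
    '"reason":"Forbidden","details":{},"code":403}'
)

if __name__ == "__main__":
    print(forbidden_user(BODY), is_unauthenticated(BODY))
```

A body naming a real identity, such as the `system:serviceaccount:default:deployer` example earlier in this thread, points at RBAC instead of at missing credentials.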

adesprez commented 4 months ago

I got it working with kubernetes.core.k8s, which had the exact same issue as kubernetes.core.k8s_exec. I need to pass the EKS token in the api_key parameter.

- name: "Get the EKS token"
  command: aws eks get-token --profile <profileName> --cluster-name <clusterName> --role-arn <arn>
  environment:
    AWS_SHARED_CREDENTIALS_FILE: /home/<username>/.aws/credentials
  register: eks_token_json

- name: "Set EKS token"
  set_fact:
    eks_api_token: "{{ eks_token_json.stdout | from_json | json_query('status.token') }}"

- name: prometheus - create monitoring namespace - in case not present yet
  kubernetes.core.k8s:
    state: present
    api_version: v1
    kind: Namespace
    name: monitoring
    kubeconfig: "{{ kubeconfig }}"
    api_key: "{{ eks_api_token }}"

Notes:

It's over-engineered, but it's the only thing that worked.
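
One thing to watch with this approach: the token is fetched once and then reused for the rest of the play. Below is a minimal Python sketch for checking whether a stored credential is still usable. It assumes the `aws eks get-token` output includes `status.expirationTimestamp` as an RFC 3339 UTC timestamp. The function name `token_is_fresh` and the one-minute margin are hypothetical choices.

```python
import json
from datetime import datetime, timedelta, timezone


def token_is_fresh(exec_credential_json: str, now: datetime,
                   margin: timedelta = timedelta(minutes=1)) -> bool:
    """Return True if the credential's token outlives `now` by `margin`."""
    status = json.loads(exec_credential_json).get("status", {})
    expires_raw = status.get("expirationTimestamp")
    if not expires_raw:
        # No expiry given: treat the token as non-expiring.
        return True
    # fromisoformat() on older Python versions rejects a trailing "Z".
    expires = datetime.fromisoformat(expires_raw.replace("Z", "+00:00"))
    return now + margin < expires


# Hypothetical credential with a made-up token and expiry.
CREDENTIAL = json.dumps({
    "kind": "ExecCredential",
    "apiVersion": "client.authentication.k8s.io/v1beta1",
    "status": {
        "expirationTimestamp": "2022-05-13T15:09:13Z",
        "token": "k8s-aws-v1.EXAMPLE",
    },
})

if __name__ == "__main__":
    print(token_is_fresh(CREDENTIAL, datetime.now(timezone.utc)))
```

In a long playbook, re-running the "Get the EKS token" task before later Kubernetes tasks would be the equivalent safeguard.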