ansible-collections / community.aws

Ansible Collection for Community AWS
GNU General Public License v3.0

The EKS cluster security token must be between 33 and 126 characters long...forcing the length of "name" parameter for aws_eks_cluster module to succeed. #817

Closed jdelaporte closed 2 years ago

jdelaporte commented 2 years ago

Summary

When leveraging a role that uses the aws_eks_cluster module, I encountered a repeated error about security token length. The issue was 'resolved' when I passed in a very long "name" parameter.

Issue Type

Bug Report

Component Name

aws_eks_cluster

Ansible Version

$ ansible --version
ansible [core 2.11.4] 
  config file = None
  configured module search path = ['/Users/jdelapor/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /Library/Python/3.8/site-packages/ansible
  ansible collection location = /Users/jdelapor/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.8.2 (default, Dec 21 2020, 15:06:04) [Clang 12.0.0 (clang-1200.0.32.29)]
  jinja version = 3.0.1
  libyaml = True

Collection Versions

$ ansible-galaxy collection list
# /Library/Python/3.8/site-packages/ansible_collections
Collection                    Version
----------------------------- -------
amazon.aws                    1.5.0  
ansible.netcommon             2.3.0  
ansible.posix                 1.2.0  
ansible.utils                 2.3.1  
ansible.windows               1.7.2  
arista.eos                    2.2.0  
awx.awx                       19.2.2 
azure.azcollection            1.8.0  
check_point.mgmt              2.0.0  
chocolatey.chocolatey         1.1.0  
cisco.aci                     2.0.0  
cisco.asa                     2.0.2  
cisco.intersight              1.0.16 
cisco.ios                     2.3.1  
cisco.iosxr                   2.4.0  
cisco.meraki                  2.4.2  
cisco.mso                     1.2.0  
cisco.nso                     1.0.3  
cisco.nxos                    2.5.0  
cisco.ucs                     1.6.0  
cloudscale_ch.cloud           2.2.0  
community.aws                 1.5.0  
community.azure               1.0.0  
community.crypto              1.8.0  
community.digitalocean        1.8.0  
community.docker              1.9.0  
community.fortios             1.0.0  
community.general             3.5.0  
community.google              1.0.0  
community.grafana             1.2.1  
community.hashi_vault         1.3.2  
community.hrobot              1.1.1  
community.kubernetes          1.2.1  
community.kubevirt            1.0.0  
community.libvirt             1.0.2  
community.mongodb             1.3.0  
community.mysql               2.1.0  
community.network             3.0.0  
community.okd                 1.1.2  
community.postgresql          1.4.0  
community.proxysql            1.1.0  
community.rabbitmq            1.1.0  
community.routeros            1.2.0  
community.skydive             1.0.0  
community.sops                1.1.0  
community.vmware              1.12.0 
community.windows             1.6.0  
community.zabbix              1.4.0  
containers.podman             1.6.2  
cyberark.conjur               1.1.0  
cyberark.pas                  1.0.7  
dellemc.enterprise_sonic      1.1.0  
dellemc.openmanage            3.6.0  
dellemc.os10                  1.1.1  
dellemc.os6                   1.0.7  
dellemc.os9                   1.0.4  
f5networks.f5_modules         1.11.0 
fortinet.fortimanager         2.1.3  
fortinet.fortios              2.1.2  
frr.frr                       1.0.3  
gluster.gluster               1.0.1  
google.cloud                  1.0.2  
hetzner.hcloud                1.4.4  
hpe.nimble                    1.1.3  
ibm.qradar                    1.0.3  
infinidat.infinibox           1.2.4  
inspur.sm                     1.2.0  
junipernetworks.junos         2.4.0  
kubernetes.core               1.2.1  
mellanox.onyx                 1.0.0  
netapp.aws                    21.6.0 
netapp.azure                  21.8.1 
netapp.cloudmanager           21.9.0 
netapp.elementsw              21.6.1 
netapp.ontap                  21.9.0 
netapp.um_info                21.7.0 
netapp_eseries.santricity     1.2.13 
netbox.netbox                 3.1.1  
ngine_io.cloudstack           2.1.0  
ngine_io.exoscale             1.0.0  
ngine_io.vultr                1.1.0  
openstack.cloud               1.5.0  
openvswitch.openvswitch       2.0.0  
ovirt.ovirt                   1.5.4  
purestorage.flasharray        1.10.0 
purestorage.flashblade        1.6.0  
sensu.sensu_go                1.11.1 
servicenow.servicenow         1.0.6  
splunk.es                     1.0.2  
t_systems_mms.icinga_director 1.20.0 
theforeman.foreman            2.1.2  
vyos.vyos                     2.5.0  
wti.remote                    1.0.1  

# /Users/jdelapor/.ansible/collections/ansible_collections
Collection         Version
------------------ -------
amazon.aws         1.5.0  
azure.azcollection 1.8.0  
community.aws      1.5.0  
google.cloud       1.0.2  
kubernetes.core    2.1.1  

AWS SDK versions

$ pip show boto boto3 botocore
Name: boto
Version: 2.49.0
Summary: Amazon Web Services Library
Home-page: https://github.com/boto/boto/
Author: Mitch Garnaat
Author-email: mitch@garnaat.com
License: MIT
Location: /Users/jdelapor/Library/Python/3.8/lib/python/site-packages
Requires: 
Required-by: 
---
Name: boto3
Version: 1.20.17
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: /Users/jdelapor/Library/Python/3.8/lib/python/site-packages
Requires: s3transfer, jmespath, botocore
Required-by: 
---
Name: botocore
Version: 1.23.17
Summary: Low-level, data-driven core of boto 3.
Home-page: https://github.com/boto/botocore
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: /Users/jdelapor/Library/Python/3.8/lib/python/site-packages
Requires: jmespath, python-dateutil, urllib3
Required-by: s3transfer, boto3, awscli

Configuration

$ ansible-config dump --only-changed
null

OS / Environment

EKS

Steps to Reproduce

When I run the playbook from https://github.com/nleiva/ansible-kubernetes with a short cluster name, it fails at this task:

- name: Create EKS cluster
  community.aws.aws_eks_cluster:
    name: "{{ cluster_name }}"
    version: "{{ k8s_version }}"
    role_arn: "{{ eks_role.arn }}"
    wait: true
    wait_timeout: "{{ eks_timeout }}"
    region: "{{ aws_region }}"
    subnets:
      - "{{ create_subnet_1.subnet.id }}"
      - "{{ create_subnet_2.subnet.id }}"
    security_groups: "{{ ec2_resource_prefix }}-SG"
  register: create_eks

It fails when I use this command to run the playbook:

ansible-playbook main.yml -vvv --extra-vars "cloud_provider=aws my_resource_prefix=jdelaportek8s my_cluster_name=jdelaportek81"

It succeeds when I run the playbook with this command:

ansible-playbook main.yml -vvv --extra-vars "cloud_provider=aws my_resource_prefix=jdelaportek8s my_cluster_name=jdelaportek81isthislongenoughforyouyetbotocore"

Expected Results

I expected the cluster to be created for any reasonable cluster name length. The aws_eks_cluster module documentation mentions no length restriction, and its examples use a 10-character name, which would be too short to succeed.

Actual Results

Using the aws eks role located here: https://github.com/nleiva/ansible-kubernetes

ansible-playbook main.yml -v --extra-vars "cloud_provider=aws my_resource_prefix=jdelaportek8s my_cluster_name=jdelaportek81"

---- truncated ----
TASK [aws_create_eks : Provision EKS Cluster tasks] **********************************************************************************************************************************************
included: /Users/jdelapor/gitstuff/ansible-kubernetes/roles/aws_create_eks/tasks/create_eks.yml for localhost

TASK [aws_create_eks : Create EKS cluster] *******************************************************************************************************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: botocore.errorfactory.InvalidParameterException: An error occurred (InvalidParameterException) when calling the CreateCluster operation: The client request token parameter must be between 33 and 126 characters.
fatal: [localhost]: FAILED! => {
"boto3_version": "1.20.17",
"botocore_version": "1.23.17",
"changed": false,
"error": {
"code": "InvalidParameterException",
"message": "The client request token parameter must be between 33 and 126 characters."
},
"message": "The client request token parameter must be between 33 and 126 characters.",
"response_metadata": {
"http_headers": {
"access-control-allow-headers": ",Authorization,Date,X-Amz-Date,X-Amz-Security-Token,X-Amz-Target,content-type,x-amz-content-sha256,x-amz-user-agent,x-amzn-platform-id,x-amzn-trace-id",
"access-control-allow-methods": "GET,HEAD,PUT,POST,DELETE,OPTIONS",
"access-control-allow-origin": "",
"access-control-expose-headers": "x-amzn-errortype,x-amzn-errormessage,x-amzn-trace-id,x-amzn-requestid,x-amz-apigw-id,date",
"connection": "keep-alive",
"content-length": "196",
"content-type": "application/json",
"date": "Wed, 01 Dec 2021 22:55:01 GMT",
"x-amz-apigw-id": "JsSC1EIDIAMFT-g=",
"x-amzn-errortype": "InvalidParameterException",
"x-amzn-requestid": "d6ee553b-0e0f-4b68-81ff-e1c434fde032",
"x-amzn-trace-id": "Root=1-61a7fd45-71e15c9a2fab959d278d1075"
},
"http_status_code": 400,
"request_id": "d6ee553b-0e0f-4b68-81ff-e1c434fde032",
"retry_attempts": 0
}
}

MSG:

Couldn't create cluster my-cluster: An error occurred (InvalidParameterException) when calling the CreateCluster operation: The client request token parameter must be between 33 and 126 characters.

PLAY RECAP ***************************************************************************************************************************************************************************************
localhost : ok=27 changed=12 unreachable=0 failed=1 skipped=7 rescued=0 ignored=0


markuman commented 2 years ago

I can confirm this.
In your scenario only 5 characters are missing:

---
- hosts: localhost
  connection: local

  tasks:
    - name: Create EKS cluster
      community.aws.aws_eks_cluster:
        name: jdelaportek8112345
        version: "1.21"
        role_arn: myEKSClusterRole
        wait: true
        region: eu-central-1
        subnets:
          - subnet-d8309db2
          - subnet-943ad4d8
        security_groups:
          - sg-f32f0196
      register: create_eks

This is the bug: https://github.com/ansible-collections/community.aws/blob/main/plugins/modules/aws_eks_cluster.py#L207

The boto3 documentation describes the parameter like this:

clientRequestToken (string) -- Unique, case-sensitive identifier you provide to ensure the idempotency of the request. This field is autopopulated if not provided.

This token must be between 33 and 126 characters long, and the module derives it from the cluster name, so short names fail.
We can remove it and see whether our integration tests still pass.
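The "5 characters missing" remark can be checked with a short sketch. It assumes the module builds the token as 'ansible-create-%s' % name. That prefix is inferred from this thread (the linked line and the longer-prefix suggestion that follows), not quoted from the module source, so treat it as an assumption.

```python
# Assumption: the module derives the token from the cluster name with this
# prefix. The prefix is inferred from the thread, not copied from the module.
TOKEN_PREFIX = "ansible-create-"
TOKEN_MIN, TOKEN_MAX = 33, 126  # limits quoted in the API error message


def token_shortfall(name):
    """Return how many characters the derived token falls below the minimum."""
    token = TOKEN_PREFIX + name
    return max(0, TOKEN_MIN - len(token))


# Print each name, the length of its derived token, and the shortfall.
for name in ("jdelaportek81", "jdelaportek8112345"):
    print(name, len(TOKEN_PREFIX + name), token_shortfall(name))
```

Under that assumption, jdelaportek81 gives a 28-character token, 5 short of the minimum, while jdelaportek8112345 reaches exactly 33.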

nleiva commented 2 years ago

I also experienced the same behavior. Could we do something like this?

clientRequestToken='ansible-create-something-very-long-just-because-%s' % name

markuman commented 2 years ago

> I also experienced the same behavior. Could we do something like this?
>
> clientRequestToken='ansible-create-something-very-long-just-because-%s' % name

I bet one day someone will come along with a name that is longer than 126 characters :)

I guess the clientRequestToken is not necessary. The integration tests pass without it. I will append it with a short name.

nleiva commented 2 years ago

> I bet one day someone will come along with a name that is longer than 126 characters :)
>
> I guess the clientRequestToken is not necessary. The integration tests pass without it. I will append it with a short name.

That's true.

I don't have much context of where this token comes from or what it's for, but if it's not required and it's safe to remove it, then I guess it's ok.

Keeping the Token with a longer prefix would be a less 'disruptive' change and we can manage/fail locally in the code and report it with a useful message to the user if the length exceeds 126 characters. I'm ok either way.

Thanks
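The "longer prefix plus local failure" option could look roughly like the sketch below. The prefix string and the function name are hypothetical, chosen only so that the prefix alone already meets the 33-character minimum. The module itself would presumably report the error through module.fail_json rather than raising an exception.

```python
TOKEN_MIN, TOKEN_MAX = 33, 126  # limits quoted in the API error message

# Hypothetical prefix, long enough on its own to satisfy the minimum length.
LONG_PREFIX = "ansible-create-eks-cluster-token-"


def build_request_token(name):
    """Build the token, failing locally with a clear message if it is too long."""
    token = LONG_PREFIX + name
    if len(token) > TOKEN_MAX:
        raise ValueError(
            "Cluster name %r is too long: the derived request token would be "
            "%d characters, but the API accepts at most %d."
            % (name, len(token), TOKEN_MAX)
        )
    return token
```

With this approach a one-character name such as 'b' is accepted, and an over-long name is rejected before any API call is made.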

jdelaporte commented 2 years ago

From the AWS EKS documentation, the clientRequestToken is not required: https://docs.aws.amazon.com/eks/latest/APIReference/API_CreateCluster.html#AmazonEKS-CreateCluster-request-clientRequestToken

It is not clear if it is auto-generated if not provided, but that could be determined by examining the response (cluster) object that is created without a token. It is meant for idempotency, according to the AWS doc. So, it is good to set it when the name is short enough.

markuman commented 2 years ago

> It is meant for idempotency, according to the AWS doc. So, it is good to set it when the name is short enough.

I wonder what this means in the context of Ansible.
The create_cluster function is called only once (when the cluster does not exist).

The same holds in general: even if the token changes, create_cluster can succeed only once for a given cluster name.

import boto3

eks = boto3.client('eks')

response = eks.create_cluster(
    name='b',
    version='1.21',
    roleArn='arn:aws:iam::123:role/myEKSClusterRole',
    resourcesVpcConfig={
        'subnetIds': [
            'subnet-d8309db2', 'subnet-943ad4d8'
        ],
        'securityGroupIds': [
            'sg-f32f0196',
        ]
    },
    clientRequestToken='stringstringstringstringstringstring'
)

print(response)

response = eks.create_cluster(
    name='b',
    version='1.21',
    roleArn='arn:aws:iam::123:role/myEKSClusterRole',
    resourcesVpcConfig={
        'subnetIds': [
            'subnet-d8309db2', 'subnet-943ad4d8'
        ],
        'securityGroupIds': [
            'sg-f32f0196',
        ]
    },
    clientRequestToken='stringstringstringstringstringstring1'
)

print(response)

results in

{'ResponseMetadata': {'RequestId': '05bfc8b2-ee3c-4f25-ab24-bd231e5224b8', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Thu, 02 Dec 2021 19:43:59 GMT', 'content-type': 'application/json', 'content-length': '820', 'connection': 'keep-alive', 'x-amzn-requestid': '05bfc8b2-ee3c-4f25-ab24-bd231e5224b8', 'access-control-allow-origin': '*', 'access-control-allow-headers': '*,Authorization,Date,X-Amz-Date,X-Amz-Security-Token,X-Amz-Target,content-type,x-amz-content-sha256,x-amz-user-agent,x-amzn-platform-id,x-amzn-trace-id', 'x-amz-apigw-id': 'JvI_zGeAliAFZrA=', 'access-control-allow-methods': 'GET,HEAD,PUT,POST,DELETE,OPTIONS', 'access-control-expose-headers': 'x-amzn-errortype,x-amzn-errormessage,x-amzn-trace-id,x-amzn-requestid,x-amz-apigw-id,date', 'x-amzn-trace-id': 'Root=1-61a921fe-3529fabd0c39a93b6e9340f6'}, 'RetryAttempts': 0}, 'cluster': {'name': 'b', 'arn': 'arn:aws:eks:eu-central-1:123:cluster/b', 'createdAt': datetime.datetime(2021, 12, 2, 20, 43, 59, 671000, tzinfo=tzlocal()), 'version': '1.21', 'roleArn': 'arn:aws:iam::123:role/myEKSClusterRole', 'resourcesVpcConfig': {'subnetIds': ['subnet-d8309db2', 'subnet-943ad4d8'], 'securityGroupIds': ['sg-f32f0196'], 'vpcId': 'vpc-6731f40d', 'endpointPublicAccess': True, 'endpointPrivateAccess': False, 'publicAccessCidrs': ['0.0.0.0/0']}, 'kubernetesNetworkConfig': {'serviceIpv4Cidr': '10.100.0.0/16'}, 'logging': {'clusterLogging': [{'types': ['api', 'audit', 'authenticator', 'controllerManager', 'scheduler'], 'enabled': False}]}, 'status': 'CREATING', 'certificateAuthority': {}, 'platformVersion': 'eks.3', 'tags': {}}}
Traceback (most recent call last):
  File "/tmp/test_eks.py", line 6, in <module>
    response = eks.create_cluster(
  File "/home/m/.local/lib/python3.9/site-packages/botocore/client.py", line 388, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/m/.local/lib/python3.9/site-packages/botocore/client.py", line 708, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ResourceInUseException: An error occurred (ResourceInUseException) when calling the CreateCluster operation: Cluster already exists with name: b

> So, it is good to set it when the name is short enough.

But yes, we can implement this logic for safety.
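A minimal sketch of that logic: pass clientRequestToken only when the derived token fits the documented limits, and otherwise leave it out so the API autopopulates it. The 'ansible-create-' prefix and the helper name are assumptions made for illustration. The parameter names follow the create_cluster call in the script above.

```python
TOKEN_MIN, TOKEN_MAX = 33, 126  # limits quoted in the API error message
TOKEN_PREFIX = "ansible-create-"  # assumed prefix, not copied from the module


def create_cluster_params(name, version, role_arn, subnets, security_groups):
    """Build create_cluster keyword arguments, adding the token only if valid."""
    params = dict(
        name=name,
        version=version,
        roleArn=role_arn,
        resourcesVpcConfig=dict(
            subnetIds=subnets,
            securityGroupIds=security_groups,
        ),
    )
    token = TOKEN_PREFIX + name
    # Omit the token entirely when its length is outside the accepted range.
    if TOKEN_MIN <= len(token) <= TOKEN_MAX:
        params["clientRequestToken"] = token
    return params
```

The result would be passed on as eks.create_cluster(**params), with eks being the boto3 client from the script above.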