ansible-collections / kubernetes.core

The collection includes a variety of Ansible content to help automate the management of applications in Kubernetes and OpenShift clusters, as well as the provisioning and maintenance of clusters themselves.

Un-wanted filtering of `release_values` when using `kubernetes.core.helm` #787

Open spatterIight opened 1 day ago

spatterIight commented 1 day ago
SUMMARY

When the kubernetes.core.helm module is used with the release_values parameter, the values can be altered by the yaml.dump operation that serializes them.

The yaml.dump operation associated with release_values is on this line of code.

The filtering applied by yaml.dump removes the single or double quotes around string values. This breaks Helm charts that require specific values to be strings rather than booleans.

For example:

TASK [postgres-operator : Deploy postgres-operator] ********************
fatal: [node-01 -> 127.0.0.1]: FAILED! => changed=false 
  command: /usr/bin/helm upgrade -i --reset-values --reuse-values=False --wait --force -f=/tmp/tmpptkljwkf.yml postgres-operator-cluster '/home/username/deployment/roles-galaxy/postgres-operator/files/Chart/postgres'
  msg: |-
    Failure when executing Helm command. Exited 1.
    stdout:
    stderr: Error: UPGRADE FAILED: failed to replace object: PostgresCluster.postgres-operator.crunchydata.com "postgres-operator-cluster" is invalid: spec.backups.pgbackrest.global.repo2-storage-verify-tls: Invalid value: "boolean": spec.backups.pgbackrest.global.repo2-storage-verify-tls in body must be of type string: "boolean"
  stderr: |-
    Error: UPGRADE FAILED: failed to replace object: PostgresCluster.postgres-operator.crunchydata.com "postgres-operator-cluster" is invalid: spec.backups.pgbackrest.global.repo2-storage-verify-tls: Invalid value: "boolean": spec.backups.pgbackrest.global.repo2-storage-verify-tls in body must be of type string: "boolean"
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>

In the error above, the CrunchyData postgres-operator chart requires repo2-storage-verify-tls to be a string, but once yaml.dump has dropped the quotes the value is read as a boolean.

Actual value: repo2-storage-verify-tls: "n"

Value after passing through yaml.dump: repo2-storage-verify-tls: n
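
To make the failure mode concrete, here is a minimal Python sketch (not part of the module) that walks a values dictionary and reports string values that a YAML 1.1 parser would read as booleans once their quotes are dropped. The helper name `find_ambiguous_strings` and the sample dictionary are hypothetical, and the literal set assumes that Helm's parser follows the YAML 1.1 boolean forms, which the error in this report suggests but does not confirm.

```python
# Boolean literals defined by the YAML 1.1 bool type.
# Assumption: Helm's parser resolves these plain (unquoted) scalars to booleans.
YAML11_BOOL_LITERALS = {
    "y", "Y", "yes", "Yes", "YES",
    "n", "N", "no", "No", "NO",
    "true", "True", "TRUE",
    "false", "False", "FALSE",
    "on", "On", "ON",
    "off", "Off", "OFF",
}


def find_ambiguous_strings(values, prefix=""):
    """Return dotted paths of string values that look like YAML 1.1 booleans."""
    found = []
    if isinstance(values, dict):
        for key, item in values.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            found.extend(find_ambiguous_strings(item, path))
    elif isinstance(values, list):
        for index, item in enumerate(values):
            found.extend(find_ambiguous_strings(item, f"{prefix}[{index}]"))
    elif isinstance(values, str) and values in YAML11_BOOL_LITERALS:
        # A string such as "n" is only safe if the emitter keeps it quoted.
        found.append(prefix)
    return found


# Hypothetical values resembling the ones in this report.
release_values = {
    "global": {
        "repo2-storage-verify-tls": "n",
        "repo2-path": "/pgbackrest/repo2",
    }
}
print(find_ambiguous_strings(release_values))
```

Running a check like this before calling the module would flag repo2-storage-verify-tls as a value that needs its quotes to survive serialization.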

ISSUE TYPE
COMPONENT NAME

kubernetes.core.helm

ANSIBLE VERSION
ansible [core 2.17.5]
configured module search path = ['/home/username/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3.12/site-packages/ansible
ansible collection location = /home/username/deployment/collections-galaxy
executable location = /usr/bin/ansible
python version = 3.12.7 (main, Oct  1 2024, 11:15:50) [GCC 14.2.1 20240910] (/usr/bin/python)
jinja version = 3.1.4
libyaml = True

COLLECTION VERSION
# /home/username/deployment/collections-galaxy/ansible_collections
Collection        Version
----------------- -------
community.general 9.4.0

# /usr/lib/python3.12/site-packages/ansible_collections
Collection        Version
----------------- -------
community.general 9.5.0

CONFIGURATION
COLLECTIONS_PATHS(/home/username/deployment/ansible.cfg) = ['/home/username/deployment/collections-galaxy']
CONFIG_FILE() = /home/username/deployment/ansible.cfg
DEFAULT_ASK_VAULT_PASS(/home/username/deployment/ansible.cfg) = False
DEFAULT_FORKS(/home/username/deployment/ansible.cfg) = 50
DEFAULT_HOST_LIST(/home/username/deployment/ansible.cfg) = ['/home/username/deployment/hosts.ini']
DEFAULT_ROLES_PATH(/home/username/deployment/ansible.cfg) = ['/home/username/deployment/roles-galaxy']
DEFAULT_STDOUT_CALLBACK(/home/username/deployment/ansible.cfg) = community.general.yaml
DEFAULT_TIMEOUT(/home/username/deployment/ansible.cfg) = 240
DEFAULT_VAULT_ID_MATCH(/home/username/deployment/ansible.cfg) = True
EDITOR(env: EDITOR) = /usr/bin/micro
RETRY_FILES_ENABLED(/home/username/deployment/ansible.cfg) = False

OS / ENVIRONMENT
STEPS TO REPRODUCE
EXPECTED RESULTS

I expected release_values to preserve the Ansible variable I passed in, without applying any filtering to it. If I wanted filtering, I would apply the to_nice_yaml filter before handing the value to the Helm module.

ACTUAL RESULTS

What actually happened is that my variable was converted from a string to a boolean:

Error: UPGRADE FAILED: failed to replace object: PostgresCluster.postgres-operator.crunchydata.com "postgres-operator-cluster" is invalid: spec.backups.pgbackrest.global.repo2-storage-verify-tls: Invalid value: "boolean": spec.backups.pgbackrest.global.repo2-storage-verify-tls in body must be of type string: "boolean"

Workaround

As a workaround, it is possible to use values_files instead of release_values; this bypasses the yaml.dump filtering.
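
One way to produce such a file without going through a YAML emitter is to serialize the values as JSON, which always double-quotes strings. This is a minimal sketch, assuming Helm accepts a JSON-formatted values file (on the basis that JSON is a subset of YAML); the function name `write_values_file` and the sample values are hypothetical.

```python
import json
import os
import tempfile


def write_values_file(values, directory=None):
    """Write values as JSON to a new temporary file and return its path.

    json.dump always emits strings in double quotes, so "n" stays a string.
    """
    handle, path = tempfile.mkstemp(suffix=".json", dir=directory)
    with os.fdopen(handle, "w", encoding="utf-8") as stream:
        json.dump(values, stream, indent=2)
    return path


# Hypothetical values resembling the ones in this report.
values = {"global": {"repo2-storage-verify-tls": "n"}}
values_path = write_values_file(values)
with open(values_path, encoding="utf-8") as stream:
    print(stream.read())
```

The returned path could then be listed under values_files in the task. A hand-written YAML file with the quotes kept in place would serve the same purpose.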

Thanks!

gravesm commented 11 hours ago

This is a limitation of YAML syntax. If you need double quotes to come through in a string value, you would have to do something like repo2-storage-verify-tls: '"n"' or repo2-storage-verify-tls: "\"n\"".

spatterIight commented 11 hours ago

> This is a limitation of yaml syntax. If you need double quotes to come through in a string value you would have to do something like repo2-storage-verify-tls: '"n"' or repo2-storage-verify-tls: "\"n\"".

This does not work. Even with these escapes applied, the value is still not correct:

Actual value: repo2-storage-verify-tls: "\"n\"" OR repo2-storage-verify-tls: '"n"'

Value after passing through yaml.dump: repo2-storage-verify-tls: '"n"'


In this case the Helm chart does NOT complain about the data type, but the variable still does not have the right value, so the pod fails:

+ pgbackrest restore --type=time '--target=2024-09-20 12:00:00-04' --stanza=db --pg1-path=/pgdata/pg14 --repo=2 --delta --target-action=promote --link-map=pg_wal=/pgdata/pg14_wal
ERROR: [032]: boolean option 'repo2-storage-verify-tls' must be 'y' or 'n'

Thank you for the response 🫡 Ultimately, my main goal here is simply to document this issue in case someone else runs into it; it took me forever to figure out what was happening.