openstack-k8s-operators / edpm-ansible

External Dataplane Management Ansible Playbooks
https://openstack-k8s-operators.github.io/edpm-ansible/
Apache License 2.0
9 stars 66 forks source link

[pre-adoption-validation]Check tuned profile #641

Closed gibizer closed 6 months ago

gibizer commented 6 months ago

New check is added to edpm_pre_adoption_validation role that ensure that the tuned profile does not change during adoption as such change would require a reboot that is not possible during adoption.

Implements: OSPRH-5715

gibizer commented 6 months ago

/hold let's run this through in the data-plane-adoption CI ...

gibizer commented 6 months ago

testing in https://github.com/openstack-k8s-operators/data-plane-adoption/pull/424

gibizer commented 6 months ago

Maybe a small documentation update would be warranted. Other than that the code appears to be in good order, and I especially like the molecule test adjustment.

Thanks for the review. I pushed a followup to add more doc to the role https://github.com/openstack-k8s-operators/edpm-ansible/pull/642 Is this what you had in mind?

gibizer commented 6 months ago

The failure in the adoption job is relevant:

    "msg": "The task includes an option with an undefined variable. The error was: 'edpm_tuned_profile' is undefined. 'edpm_tuned_profile' is undefined\n\nThe error appears to be in '/usr/share/ansible/collections/ansible_collections/osp/edpm/roles/edpm_pre_adoption_validation/tasks/tuned.yml': line 35, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Validate that current active tuned profile won't change\n  ^ here\n"

While the edpm_tuned_profile ansible variable is visible to the edpm_pre_adoption role when it is defined in the inventory, the default value of the variable is only set by the edpm_tuned role when executed, but that role will not be executed in the validation deployment so no defaulting happen. I either need to find a way to re-use the defaulting or I need to duplicate the defaulting.

jpodivin commented 6 months ago

Maybe a small documentation update would be warranted. Other than that the code appears to be in good order, and I especially like the molecule test adjustment.

Thanks for the review. I pushed a followup to add more doc to the role #642 Is this what you had in mind?

Pretty much yes.

jpodivin commented 6 months ago

The failure in the adoption job is relevant:

�[0;31m    "msg": "The task includes an option with an undefined variable. The error was: 'edpm_tuned_profile' is undefined. 'edpm_tuned_profile' is undefined\n\nThe error appears to be in '/usr/share/ansible/collections/ansible_collections/osp/edpm/roles/edpm_pre_adoption_validation/tasks/tuned.yml': line 35, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Validate that current active tuned profile won't change\n  ^ here\n"�[0m

While the edpm_tuned_profile ansible variable is visible to the edpm_pre_adoption role when it is defined in the inventory, the default value of the variable is only set by the edpm_tuned role when executed, but that role will not be executed in the validation deployment so no defaulting happen. I either need to find a way to re-use the defaulting or I need to duplicate the defaulting.

You can create an arg in the role that will be filled with this variable if it exists, otherwise it will default to something. However, does it even make sense for the role to run in situations when the inventory var edpm_tuned_profile isn't set? If not, I would condition it's execution on that as well.

gibizer commented 6 months ago

The failure in the adoption job is relevant:

�[0;31m    "msg": "The task includes an option with an undefined variable. The error was: 'edpm_tuned_profile' is undefined. 'edpm_tuned_profile' is undefined\n\nThe error appears to be in '/usr/share/ansible/collections/ansible_collections/osp/edpm/roles/edpm_pre_adoption_validation/tasks/tuned.yml': line 35, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Validate that current active tuned profile won't change\n  ^ here\n"�[0m

While the edpm_tuned_profile ansible variable is visible to the edpm_pre_adoption role when it is defined in the inventory, the default value of the variable is only set by the edpm_tuned role when executed, but that role will not be executed in the validation deployment so no defaulting happen. I either need to find a way to re-use the defaulting or I need to duplicate the defaulting.

You can create an arg in the role that will be filled with this variable if it exists, otherwise it will default to something. However, does it even make sense for the role to run in situations when the inventory var edpm_tuned_profile isn't set? If not, I would condition it's execution on that as well.

If the inventory var edpm_tuned_profile isn't set then the edpm_tuned role will default it to throughput-performance via its role level default, and then applies that profile to the edpm node. So even if the user does not request any profile a specific profile will be applied. As the check added in this commit needs to see that profile to be applied is not different from the profile that is running the check needs the same default value as the edpm_tuned role. I can duplicate the default from edpm_tuned role into edpm_pre_adoption_validation role, but that could be problematic if we ever change the default in the edpm_tuned role and forget to update the default in the check role. In the meantime I found that include_vars works for pulling the default across roles at least in the molecule env. I will push an update here and restart the adoption job to see if this trick works or not in a real env. If not then I will fall back to duplication the defaults.

gibizer commented 6 months ago

/unhold

The new check run successfully in the adoption job https://logserver.rdoproject.org/24/424/1fb8008af212a42f96813eb8a3b2b141699c817d/github-check/data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph-rdojobs/affcc5e/controller/ci-framework-data/logs/openstack-k8s-operators-openstack-must-gather/namespaces/openstack/pods/pre-adoption-validation-openstack-pre-adoption-openstack-2xbm7/logs/openstackansibleee.log

So I think it is good to go

jpodivin commented 6 months ago

/unhold

The new check run successfully in the adoption job https://logserver.rdoproject.org/24/424/1fb8008af212a42f96813eb8a3b2b141699c817d/github-check/data-plane-adoption-osp-17-to-extracted-crc-minimal-no-ceph-rdojobs/affcc5e/controller/ci-framework-data/logs/openstack-k8s-operators-openstack-must-gather/namespaces/openstack/pods/pre-adoption-validation-openstack-pre-adoption-openstack-2xbm7/logs/openstackansibleee.log

So I think it is good to go

Great. But I would still feel better if there was a note in both roles about this var. Ansible behavior does change over time, and this could become a nasty surprise a year or more from now.

openshift-ci[bot] commented 6 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gibizer, jpodivin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/openstack-k8s-operators/edpm-ansible/blob/main/OWNERS)~~ [gibizer,jpodivin] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment