ScaleComputing / HyperCoreAnsibleCollection

Official Ansible collection for Scale Computing SC//HyperCore (HC3) v1 API
GNU General Public License v3.0

:lady_beetle: Bug: hypercore 9.3 new machine types cause error #262

Closed ddemlow closed 9 months ago

ddemlow commented 10 months ago

Describe the bug

fatal: [192.168.0.230]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/local/bin/python3.10"}, "changed": false, "msg": "Virtual machine: ubtpmcompat-nogpu has an invalid Machine type: scale-uefi-tpm-compatible-9.3."}
fatal: [vlb04a-01.lab.local]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/local/bin/python3.10"}, "changed": false, "msg": "Virtual machine: davetestuefi has an invalid Machine type: scale-uefi-9.3."}

To Reproduce

Tested against an internal 9.3.0 release with these machine types. Simply listing VMs throws the error.

Valid 9.3 machine types:

Firmware Version: 9.3.0.212125 ()
Current Logged In Users:
root pts/0 2023-09-12 13:18 (10.100.23.241)
[192.168.18.35 (1a5fb91d) ~ 13:18:50]# sc vmmachinetypes show
scale-bios-9.3
scale-7.2
scale-5.4
scale-8.10
scale-uefi-9.3
scale-uefi-tpm-compatible-9.3
scale-bios-lsi-9.2
scale-uefi-tpm-9.3
scale-6.4
scale-uefi-tpm-9.2

Expected behavior

List virtual machines as is. If possible, allow new or unexpected machine types, at least for viewing; that may be more flexible going forward.

justinc1 commented 9 months ago

We wanted the vm module and friends to mimic what the UI offers when creating or listing a VM. That is the BIOS, UEFI and vTPM+UEFI options, as in the attached image.

The code currently assumes a fixed mapping - https://github.com/ScaleComputing/HyperCoreAnsibleCollection/blob/main/plugins/module_utils/vm.py#L62-L66. I do not have access to a v9.3 cluster; I guess scale-uefi-9.3 is shown in the UI as UEFI. But what would be shown in the UI for vTPM+UEFI? Any of scale-uefi-tpm-9.2, scale-uefi-tpm-9.3 or scale-uefi-tpm-compatible-9.3 sounds like a good choice.

The easiest solution would be to just expose those strings unmodified, but then the playbook author must know whether, say, the scale-uefi-9.3 type is supported on the target cluster. In the end playbooks would not be portable across different clusters, and that could do more harm than good.

Maybe for the vm_info module: if the machine type returned by the API is not known, the module returns UNKNOWN for the machine type. In addition, the module would always return scale_machine_type, the value exactly as returned by the API.
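A minimal sketch of that vm_info behaviour, in Python. The mapping values and the function name `describe_machine_type` are assumptions made for illustration; they are not copied from vm.py.

```python
# Hypothetical reverse mapping from HyperCore API strings to the
# user-facing names; the entries are assumptions, not the real table.
KNOWN_MACHINE_TYPES = {
    "scale-7.2": "BIOS",
    "scale-8.10": "UEFI",
    "scale-uefi-tpm-9.2": "vTPM+UEFI",
}


def describe_machine_type(api_value):
    """Return both the friendly name and the raw API value.

    Unknown API values map to "UNKNOWN" instead of raising, so that
    listing VMs keeps working on clusters with newer machine types.
    """
    return {
        "machine_type": KNOWN_MACHINE_TYPES.get(api_value, "UNKNOWN"),
        "scale_machine_type": api_value,
    }
```

With this shape, a VM reported by the API as scale-uefi-9.3 would be listed with machine_type UNKNOWN while the raw string stays available to the playbook.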

As for the vm module, I would try to keep the same options as documented here: https://scalecomputing.github.io/HyperCoreAnsibleCollection/collections/scale_computing/hypercore/vm_module.html#parameter-machine_type. Are these strings also the same in HyperCore 9.3, @ddemlow? But we will need to make the mapping HyperCore version dependent.

What do you think?

ddemlow commented 9 months ago

These are the boot options displayed in the UI for 9.3.1 (image).

Currently we also display the native machine type strings in the UI (image).

I spoke to Mitch @ Scale, and he is going to take the lead on this and on the ideal resolution (for now and going forward).

metchason commented 9 months ago

@justinc1 I know we recently implemented some changes on the HyperCore side to help with machine type handling. I am currently checking whether that can be of any help here.

Also, if it is any help, I do not expect our machine types to change this frequently over the long term.

justinc1 commented 9 months ago

I think we should extend the Ansible vm_info module to also return the unmodified machine type value (an extra field named something like machine_type_raw).

The input to the vm module should also allow the UEFI+vTPM (Compatibility) type. I think an error should be returned if this type is used on HyperCore v9.2 or lower.
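A rough sketch of such a version check, in Python. The option string in `COMPAT_TYPE` and both function names are hypothetical; the version string format is assumed to look like the 9.3.0.212125 firmware version quoted earlier in this issue.

```python
# Hypothetical user-facing option name for the new 9.3 machine type.
COMPAT_TYPE = "vTPM+UEFI (Compatibility)"


def parse_major_minor(version):
    """Turn a version string such as '9.3.0.212125' into (9, 3)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def check_machine_type_supported(machine_type, hypercore_version):
    """Raise ValueError if the requested type needs a newer HyperCore."""
    if machine_type == COMPAT_TYPE and parse_major_minor(hypercore_version) < (9, 3):
        raise ValueError(
            "Machine type %r requires HyperCore 9.3 or newer, cluster runs %s"
            % (machine_type, hypercore_version)
        )
```

Comparing (major, minor) tuples avoids the string-comparison trap where "9.10" would sort before "9.3".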

I would like to figure out the mapping between the strings in the HyperCore UI and the machine_type values in the HyperCore REST API. Maybe UEFI+vTPM from 9.2 got renamed in 9.3 to UEFI+vTPM (Compatibility)? I should test this on a test cluster. Can we set up a vSNS node? Last time @ddemlow set them up, and I'm not sure I can do this myself (I do not have the root SSH key, and I do not know whether there are instructions for doing this).

Since the machine types do not change frequently, some HyperCore version dependent code in the Ansible collection should not become a maintenance nightmare. I mean, we will not need to update the mapping table on every minor HyperCore release.

Is the plan OK, @metchason?

metchason commented 9 months ago

OK, so after talking with my team, I think there might be an option to provide backwards compatibility here without requiring that users update their playbooks.

In HyperCore v9.3, we introduced the concept of machine type keywords. So Ansible can first try to create VMs using machineTypeKeyword, which is only available in 9.3. If the request fails, Ansible can fall back to the older FROM_ANSIBLE_TO_HYPERCORE_MACHINE_TYPE mapping to identify the hard-coded machine type for the next attempt.
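A sketch of that try-then-fall-back flow, in Python. Everything other than the names machineTypeKeyword and FROM_ANSIBLE_TO_HYPERCORE_MACHINE_TYPE is an assumption: the mapping values, the payload shape, the machineType field name, the `RequestRejected` exception and the `send_request` callable are placeholders for whatever the collection's REST client actually provides. The keyword is passed in by the caller because the keyword list is not reproduced in the text here.

```python
# Assumed contents; the real table lives in plugins/module_utils/vm.py.
FROM_ANSIBLE_TO_HYPERCORE_MACHINE_TYPE = {
    "BIOS": "scale-7.2",
    "UEFI": "scale-8.10",
    "vTPM+UEFI": "scale-uefi-tpm-9.2",
}


class RequestRejected(Exception):
    """Hypothetical error raised when the cluster refuses a payload."""


def create_vm(send_request, name, machine_type, keyword):
    """Try the 9.3 keyword first, then the hard-coded machine type.

    send_request is any callable that takes a payload dict and raises
    RequestRejected when the cluster does not accept it.
    """
    try:
        # Only HyperCore 9.3+ is expected to understand this field.
        return send_request({"name": name, "machineTypeKeyword": keyword})
    except RequestRejected:
        # Older clusters: retry with the version-specific string.
        hard_coded = FROM_ANSIBLE_TO_HYPERCORE_MACHINE_TYPE[machine_type]
        return send_request({"name": name, "machineType": hard_coded})
```

One cost of this design is an extra failed request on every create against a pre-9.3 cluster; checking the cluster version up front would avoid that at the price of more version-dependent code.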

Here are the machine type keywords in HyperCore v9.3 (screenshot).

@ddemlow I know you created a HyperCore virtual SNS on 9.3. Was that to provide to XLAB or for your own purposes? If the latter, I can help run down creating a new one.

metchason commented 9 months ago

@justinc1 I did confirm that the new vSNS is accessible in the CI lab.

justinc1 commented 9 months ago

Yes, thank you for creating the 9.3 vSNS. I will work on this now.