Cray-HPE / sat

System Admin Toolkit
https://cray-hpe.github.io/docs-sat/
MIT License
4 stars 5 forks source link

CRAYSAT-1896: Add support for CFS v2 or v3 to `sat bootprep` #288

Open haasken-hpe opened 2 weeks ago

haasken-hpe commented 2 weeks ago

Summary and Scope

Add support for using either the CFS v2 or v3 API in sat bootprep based on the value of the command-line option --cfs-version or the config-file option cfs.api_version. Before this change, sat bootprep was hard-coded to always use CFS v2.

Update the InputConfiguration, InputConfigurationLayer, and AdditionalInventory classes to keep a reference to the instance of the version-specific subclass of CFSClientBase, so that they can delegate operations to the appropriate version of the CFS API.

This removes a lot of code that required knowledge of the CFS API and instead uses classes and methods defined in the csm_api_client library, which has already been updated to support both CFS v2 and v3. This should reduce code duplication across sat and csm_api_client.

Some unit tests are no longer needed here because that functionality is tested in the csm_api_client instead. Other unit tests are updated to include the cfs_client instance that needs to be passed into the input layer and configuration objects.

This also makes the playbook property required in the schema because the CFS v3 API requires that the playbook be specified. This is technically a backwards-incompatible schema change because it can invalidate certain bootprep input files if they define layers without playbooks. The default bootprep files contain a playbook for every layer and do not depend on the CFS default playbook value, so this should be a pretty safe and non-disruptive change.

Issues and Related PRs

Testing

Tested on:

Test description:

Unit tests only so far.

Still needs to be tested on a real system.

Risks and Mitigations

This has a moderate risk level. It introduces a bootprep input file schema change that can cause files that do not specify playbook to not validate anymore.

It also is a significant code refactor in that it pushes a lot of the CFS-API knowledge out of the bootprep code and offloads it to the csm_api_client library.

These risks can be mitigated by thorough testing to ensure that CFS configuration creation and image customization still works in sat bootprep.

Pull Request Checklist

haasken-hpe commented 2 weeks ago

Keeping this as a draft until I test, but it's a fairly large PR, so feel free to look at it before I finish testing and let me know if you have any questions or concerns.

haasken-hpe commented 2 weeks ago

It looks like I need to still add support for pagination of resources other than sessions (e.g. configurations). We do a GET of /configurations when we are checking whether any of the configurations that are going to be created by the input file already exist in CFS.

So I need to add paging support for other CFS API resources to python-csm-api-client and then update this code to use that. It should be a pretty straightforward change. I'll work on that next week.

Here's the error I get right now without this support for paging:

Traceback (most recent call last):
  File "/sat/venv/bin/sat", line 8, in <module>
    sys.exit(main())
  File "/sat/venv/lib/python3.9/site-packages/sat/main.py", line 124, in main
    subcommand(args)
  File "/sat/venv/lib/python3.9/site-packages/sat/cli/bootprep/main.py", line 405, in do_bootprep
    do_bootprep_run(schema_validator, args)
  File "/sat/venv/lib/python3.9/site-packages/sat/cli/bootprep/main.py", line 255, in do_bootprep_run
    instance.input_configurations.handle_existing_items(args.overwrite_configs,
  File "/sat/venv/lib/python3.9/site-packages/sat/cli/bootprep/input/base.py", line 456, in handle_existing_items
    existing_items_by_name = self.get_existing_items_by_name()
  File "/sat/venv/lib/python3.9/site-packages/sat/cli/bootprep/input/configuration.py", line 458, in get_existing_items_by_name
    return {
  File "/sat/venv/lib/python3.9/site-packages/sat/cli/bootprep/input/configuration.py", line 459, in <dictcomp>
    configuration.get('name'): [configuration]
AttributeError: 'str' object has no attribute 'get'

This is because a GET to /configurations now returns an object containing the array of configurations under a new key instead of just a plain array.

haasken-hpe commented 2 weeks ago

I also still need to add the --cfs-version command-line option to the bootprep command.

haasken-hpe commented 2 weeks ago

I appear to have broken the ability to specify "latest" as the product version in a configuration layer as well. I'll have to look into that.

haasken-hpe commented 1 week ago

I have fixed the issues noted in my prior comments. That is:

haasken-hpe commented 1 week ago

Minimal testing performed so far:

ncn-m001:~/haasken/CRAYSAT-1896 # cat bootprep.yaml
---
configurations:
- name: haasken-test
  layers:
  - name: csm-layer
    product:
      name: csm
      version: 1.5.2
    playbook: csm_packages.yml
  - name: csm-layer
    product:
      name: csm
      version: latest
    playbook: site.yml
ncn-m001:~/haasken/CRAYSAT-1896 # sat --loglevel debug  bootprep run  bootprep.yaml --save-files
INFO: Validating given input file bootprep.yaml
WARNING: Schema version specified in input file (1.0.0) has an older minor version than current schema version (1.1.0). Unexpected behavior may occur.
INFO: Input file successfully validated against schema
DEBUG: Loaded auth token from /root/.config/sat/tokens/api_gw_service_nmn_local.vers.json.
DEBUG: Couldn't load the in-cluster config: Service host/port is not set. (proceeding under the assumption that the config should be loaded from the kubeconfig file)
DEBUG: Issuing GET request to URL 'https://api-gw-service-nmn.local/apis/cfs/v3/configurations'
DEBUG: Received response to GET request to URL 'https://api-gw-service-nmn.local/apis/cfs/v3/configurations' with status code: '200': OK
1 CFS configuration already exists with the name haasken-test. Would you like to skip, overwrite, or abort? [skip,overwrite,abort] overwrite
INFO: 1 CFS configuration already exists with the name haasken-test and will be overwritten.
INFO: Creating 1 CFS configuration
INFO: Creating CFS configuration at index 0 with name=haasken-test
INFO: Saving CFS configuration request body to ./cfs-configuration-haasken-test.json
DEBUG: Issuing PUT request to URL 'https://api-gw-service-nmn.local/apis/cfs/v3/configurations/haasken-test'
DEBUG: Received response to PUT request to URL 'https://api-gw-service-nmn.local/apis/cfs/v3/configurations/haasken-test' with status code: '200': OK
INFO: Successfully created CFS configuration at index 0 with name=haasken-test
DEBUG: Given input file did not define any images, so skipping validation.
INFO: Given input did not define any IMS images
INFO: Nothing to create in collection of BOS session templates
################################################################################
CFS configurations
################################################################################
+--------------+
| name         |
+--------------+
| haasken-test |
+--------------+
ncn-m001:~/haasken/CRAYSAT-1896 # cray cfs v3 configurations describe haasken-test
{
  "last_updated": "2024-11-18T23:46:06Z",
  "layers": [
    {
      "clone_url": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
      "commit": "ac252601e3df13b70d475dff8ee60fb45f1c27b2",
      "name": "csm-layer",
      "playbook": "csm_packages.yml"
    },
    {
      "clone_url": "https://api-gw-service-nmn.local/vcs/cray/csm-config-management.git",
      "commit": "ac252601e3df13b70d475dff8ee60fb45f1c27b2",
      "name": "csm-layer",
      "playbook": "site.yml"
    }
  ],
  "name": "haasken-test"
}

I need to do some more thorough testing, and while doing so, it would probably be nice to think about how we can start creating some automated tests of the sat bootprep functionality to include in the CT framework.

haasken-hpe commented 5 days ago

Here are my testing results from rocket: https://gist.github.com/haasken-hpe/8cc8455889b7afe729c5b77dcebfc0f7

I have only done limited testing with --cfs-version v2, and I'm just using the v3 default in these tests, but I have done some minimal testing with v2, and I think we can be sure to test this with our automated functional tests we are going to write.

haasken-hpe commented 4 days ago

Since I have merged https://github.com/Cray-HPE/python-csm-api-client/pull/47 and have pushed a release/2.3 branch there, we now have a stable build, so I am rebasing this PR to drop the commit that references the unstable repo.