SovereignCloudStack / standards

SCS standards in a machine readable format
https://scs.community/
Creative Commons Attribution Share Alike 4.0 International

[Other] Evaluate SCS-compliance of Yaook with respect to IaaS standards beyond SCS-compatible IaaS v4 #718

Closed anjastrunk closed 1 hour ago

anjastrunk commented 2 months ago

SCS-compatible IaaS v4 includes the following standards:

Beyond that, new standards have been defined that may become part of SCS-compatible IaaS v5. We have to evaluate whether [FitKo Poc] and the C&H SCS test cluster are SCS-compliant with respect to the following new IaaS standards:

Edit: With the new v5 scope there are additional standards we need to evaluate: https://github.com/SovereignCloudStack/standards/issues/807

mbuechse commented 2 months ago

110 is a DR, not new

anjastrunk commented 2 months ago

Do we plan to stabilize this draft in the near future?

mbuechse commented 2 months ago

It is stable.

anjastrunk commented 2 months ago

I see, it is a Decision Record, not a standard. My fault ;-) I will remove 110 from the to-do list.

josephineSei commented 2 months ago

Conformance with Volume Type Standard

The Volume Type standard currently does NOT REQUIRE any changes, it only RECOMMENDS them.

Recommended volume types

  1. At least one type with REPLICATION
  2. At least one type with ENCRYPTION

Both recommendations can be fulfilled by a single volume type that has, e.g., replicated Ceph storage and an encryption type.

How to fulfill the recommendations?

Volume Type with Replication

There are several options to fulfill this:

1. Ceph is used as backend: Ceph has internal replication configured, which is deemed sufficient.

2. Other backends are used: it needs to be checked whether they have internal replication. Otherwise, option 3 must be used.

3. OpenStack replication is configured.

In addition to all these options, the description of the volume type has to be adjusted:

+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field              | Value                                                                                                                                                        |
+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+
| access_project_ids | None                                                                                                                                                         |
| description        | [scs:encrypted, replicated] Content will be replicated three times to ensure consistency and availability for your data. LUKS encryption is used.           |
| id                 | d63307fb-167a-4aa0-9066-66595ea9fb21                                                                                                                         |
| is_public          | True                                                                                                                                                         |
| name               | hdd-three-replicas-LUKS                                                                                                                                      |
| properties         |                                                                                                                                                              |
| qos_specs_id       | None                                                                                                                                                         |
+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+

Volume Type with Encryption

This will require the following

Does Yaook fulfill this?

  1. If Ceph is used, we only need to update the description of the Volume Type to fulfill the Replication recommendation. This is a very small adjustment.

  2. To fulfill the Encryption recommendation, we need:
     2.1. a working Key Manager Operator
     2.2. the config changes in Cinder and Nova (may already be included)
     2.3. the setup of a volume type with an encryption type (was tested in the past; we need to make sure everything still works)
     2.4. a changed description of the volume type

stack@devstack:~/devstack$ openstack volume type show LUKS
+--------------------+--------------------------------------+
| Field              | Value                                |
+--------------------+--------------------------------------+
| access_project_ids | None                                 |
| description        | None                                 |
| id                 | 15002270-b9ed-4b7f-ba08-cea35f3acc1f |
| is_public          | True                                 |
| metadata           | {}                                   |
| name               | LUKS                                 |
| properties         |                                      |
| qos_specs_id       | None                                 |
+--------------------+--------------------------------------+
stack@devstack:~/devstack$ openstack volume type set --description "[scs:encrypted, replicated] Content will be replicated three times to ensure consistency and availability for your data. LUKS encryption is used." LUKS
stack@devstack:~/devstack$ openstack volume type show LUKS
+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| Field              | Value                                                                                                                                            |
+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+
| access_project_ids | None                                                                                                                                             |
| description        | [scs:encrypted, replicated] Content will be replicated three times to ensure consistency and availability for your data. LUKS encryption is used. |
| id                 | 15002270-b9ed-4b7f-ba08-cea35f3acc1f                                                                                                             |
| is_public          | True                                                                                                                                             |
| metadata           | {}                                                                                                                                               |
| name               | LUKS                                                                                                                                             |
| properties         |                                                                                                                                                  |
| qos_specs_id       | None                                                                                                                                             |
+--------------------+--------------------------------------------------------------------------------------------------------------------------------------------------+

Setting descriptions on volume types is still possible when the volume type is already in use (volumes with this type exist).
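A minimal sketch of how the description tag could be checked, assuming the tag always takes the form shown in the descriptions above (a leading `[scs:...]` with comma-separated feature names). The helper name `scs_volume_features` is hypothetical and not taken from the SCS test suite.

```python
import re

# Assumed tag format, taken from the descriptions shown above:
# "[scs:encrypted, replicated] free text". The helper name is hypothetical.
_TAG_RE = re.compile(r"^\[scs:([^\]]*)\]")


def scs_volume_features(description):
    """Return the set of feature names found in a leading "[scs:...]" tag."""
    if not description:
        return set()
    match = _TAG_RE.match(description)
    if match is None:
        return set()
    return {part.strip() for part in match.group(1).split(",") if part.strip()}


tagged = ("[scs:encrypted, replicated] Content will be replicated three "
          "times to ensure consistency and availability for your data. "
          "LUKS encryption is used.")
# A description without the opening bracket is not recognized as tagged.
untagged = "scs:encrypted, replicated] Content will be replicated three times."

print(sorted(scs_volume_features(tagged)))
print(sorted(scs_volume_features(untagged)))
print(sorted(scs_volume_features(None)))
```

A check like this treats a volume type with no description (as in the first `show` output above) the same as one without a tag.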

josephineSei commented 2 months ago

Conformance with Default Rules for Security Groups Standard

The standard just requires the following default rules for security groups, ALL of which are provided when installing OpenStack without adjustment. So this standard just states that no life-cycle management system should touch these rules (and neither should operators).

stack@devstack:~/devstack$ openstack default security group rule list
+------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+----------------------+--------------------------------+-------------------------------+
| ID                     | IP Protocol | Ethertype | IP Range  | Port Range | Direction | Remote Security Group | Remote Address Group | Used in default Security Group | Used in custom Security Group |
+------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+----------------------+--------------------------------+-------------------------------+
| 2b82f187-b385-4fae-    | None        | IPv6      | ::/0      |            | egress    | None                  | None                 | True                           | True                          |
| 9142-6a72ace72fac      |             |           |           |            |           |                       |                      |                                |                               |
| 315c216b-05ef-42a4-    | None        | IPv4      | 0.0.0.0/0 |            | ingress   | PARENT                | None                 | True                           | False                         |
| a0ca-84f7358208c2      |             |           |           |            |           |                       |                      |                                |                               |
| 44700fc3-d576-4dce-    | None        | IPv4      | 0.0.0.0/0 |            | egress    | None                  | None                 | True                           | True                          |
| bf7f-b530b9ef1074      |             |           |           |            |           |                       |                      |                                |                               |
| e66c8e2c-8d42-4a0f-    | None        | IPv6      | ::/0      |            | ingress   | PARENT                | None                 | True                           | False                         |
| a84d-deb7e9c37b6b      |             |           |           |            |           |                       |                      |                                |                               |
+------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+----------------------+--------------------------------+-------------------------------+
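The shape of the listing above can be expressed as a small check. This is only a sketch: the rules are transcribed by hand from the table, the dictionary keys are names chosen for this example (not a guaranteed API schema), and `find_violations` is a hypothetical helper.

```python
# Rules transcribed from the listing above. The key names are assumptions
# chosen for this sketch, not a guaranteed API schema.
RULES = [
    {"ethertype": "IPv6", "direction": "egress", "remote_group": None,
     "in_default": True, "in_custom": True},
    {"ethertype": "IPv4", "direction": "ingress", "remote_group": "PARENT",
     "in_default": True, "in_custom": False},
    {"ethertype": "IPv4", "direction": "egress", "remote_group": None,
     "in_default": True, "in_custom": True},
    {"ethertype": "IPv6", "direction": "ingress", "remote_group": "PARENT",
     "in_default": True, "in_custom": False},
]


def find_violations(rules):
    """Return messages for rules that deviate from the listing above."""
    problems = []
    for rule in rules:
        if rule["direction"] == "ingress":
            # Ingress is only expected from members of the same group,
            # and only in the default security group.
            if rule["remote_group"] != "PARENT":
                problems.append(f"open ingress rule: {rule}")
            if rule["in_custom"]:
                problems.append(f"ingress rule in custom groups: {rule}")
    egress = {r["ethertype"] for r in rules if r["direction"] == "egress"}
    for ethertype in ("IPv4", "IPv6"):
        if ethertype not in egress:
            problems.append(f"missing {ethertype} egress rule")
    return problems


print(find_violations(RULES))
```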

This is most likely already satisfied with Yaook. The only thing left to do:

josephineSei commented 2 months ago

Conformance with Key Manager Standard

Right now all parts of this standard are only recommendations, so every tool currently fulfills this standard.

To also fulfill recommendations, the following things must be considered:

1. There SHOULD be a key manager

2. The Master KEK SHOULD NOT be written in plain text in the barbican.conf file

At this point in time, this can be achieved by using any plugin other than the simple crypto plugin.

Right now Yaook seems to use the simple crypto plugin. See here.

Yaook should support integration of other plugins, too. This may be the biggest point, because it needs lots of testing, whereas the other standards only need small changes because Barbican can already be integrated into Yaook deployments.
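A rough sketch of how one might flag a plaintext KEK in a `barbican.conf`. The section and option names (`[crypto] enabled_crypto_plugins`, `[simple_crypto_plugin] kek`) are assumptions about Barbican's configuration layout and should be verified against the deployed release; the sample file and the helper name are made up for illustration.

```python
import configparser

# Hypothetical sample file. Section and option names are assumptions about
# Barbican's simple crypto plugin configuration and should be verified.
SAMPLE_CONF = """
[secretstore]
enabled_secretstore_plugins = store_crypto

[crypto]
enabled_crypto_plugins = simple_crypto

[simple_crypto_plugin]
kek = PLACEHOLDER_NOT_A_REAL_KEY
"""


def plaintext_kek_in_use(conf_text):
    """Return True if the simple crypto plugin is enabled with an inline KEK."""
    # strict=False tolerates repeated options; interpolation=None keeps "%".
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    plugins = parser.get("crypto", "enabled_crypto_plugins", fallback="")
    uses_simple = "simple_crypto" in [p.strip() for p in plugins.split(",")]
    has_kek = parser.has_option("simple_crypto_plugin", "kek")
    return uses_simple and has_kek


print(plaintext_kek_in_use(SAMPLE_CONF))
```

The sketch only flags an explicitly configured plugin; if the option is unset, the effective default would have to be checked separately.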

martinmo commented 2 months ago

To allow everyone involved in this ticket to run the conformance tests against a Yaook cluster, I created projects and accounts for @fraugabel, @josephineSei and @markus-hentsch on our SCS Yaook PoC deployment.

fraugabel commented 2 months ago

thanks @martinmo works for me: i will start with writing the test for the security group rules scs-0115-v1: Default Rules for Security Groups

josephineSei commented 2 months ago

@fraugabel I don't know, if you know: We write tests for the standards already when writing them. Here is the test for the default security group rules: https://github.com/SovereignCloudStack/standards/blob/main/Tests/iaas/security-groups/default-security-group-rules.py

mbuechse commented 2 months ago

How could she (and I, for that matter)? The standard only states what the test should do. It doesn't reference any test.

mbuechse commented 2 months ago

@fraugabel then you can test the existing test script and please update the standard

fraugabel commented 2 months ago

i find it a bit inconsistent that the argument for the os cloud in the default-security-group-rules test differs from the argument of the overall compliance test:

usage: default-security-group-rules.py [-h] [--os-cloud OS_CLOUD] [--debug]

SCS Default Security Group Rules Checker

options:
  -h, --help           show this help message and exit
  --os-cloud OS_CLOUD  Name of the cloud from clouds.yaml, alternative to the OS_CLOUD environment variable
  --debug              Enable OpenStack SDK debug logging

whereas the argument in the compliance tests would be:

./scs-compliance-check.py -s CLOUDNAME -a os_cloud=CLOUDNAME scs-compatible-iaas.yaml

mbuechse commented 2 months ago

@fraugabel this is by design. I can explain when I'm back

mbuechse commented 2 months ago

Keep in mind that the main script is agnostic of OpenStack. It can be used to call all manner of test scripts

fraugabel commented 2 months ago

@mbuechse i just stumbled across that and wondered whether there is an agreement on argument namings like os-cloud vs os_cloud

mbuechse commented 2 months ago

@fraugabel You're right. It's a rather common transformation between cli arguments and variable names: _ in variable name becomes - in cli argument and vice versa. The Python library click for instance does this automatically IIRC.
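For illustration, Python's standard `argparse` applies the same transformation: an option declared as `--os-cloud` is exposed as the attribute `os_cloud`. The `flag_for` helper is hypothetical and only shows the reverse direction; it is not taken from the compliance check script.

```python
import argparse

parser = argparse.ArgumentParser(description="hyphen/underscore demo")
# Declared with a hyphen on the command line ...
parser.add_argument("--os-cloud", help="name of the cloud from clouds.yaml")
args = parser.parse_args(["--os-cloud", "CLOUDNAME"])
# ... but read back with an underscore in the attribute name.
print(args.os_cloud)  # prints CLOUDNAME


def flag_for(variable_name):
    """Map a variable name such as 'os_cloud' to a CLI flag (hypothetical)."""
    return "--" + variable_name.replace("_", "-")


key, _, value = "os_cloud=CLOUDNAME".partition("=")
print(flag_for(key), value)  # prints --os-cloud CLOUDNAME
```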

fraugabel commented 2 months ago

@mbuechse i am on it, the connection.network.default_security_group_rules() seems to be outdated

josephineSei commented 2 months ago

@fraugabel and I had a discussion about the Default Security Group Rules Test. We found something very important:

The API I use to test the Default Security Group Rules is pretty new. It was added in the 2023.2 release: "A new API which allows a cloud administrator to define their own set of security group rules added automatically to every new default and/or custom security group created for projects."

But Yaook, as it is used for the test case in Leipzig, does not yet support this or newer releases. As we will have this problem with other standards in the future, too (e.g. Domain Manager or the roles standard), it would be important to close the gap between new OpenStack releases and Yaook.

Until then, we discussed adding backwards compatibility to the test: if the new API does not exist, we will fall back to 1. creating a new security group, 2. checking that the only rules in this group are two rules allowing all egress traffic, and 3. deleting the security group afterwards.
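The fallback could look roughly like the sketch below. It is not the actual test script: `network` stands for an object that behaves like the openstacksdk network proxy (the method names follow the SDK as mentioned in this thread), the field name `used_in_non_default_sg` and the exception used to signal a missing API are assumptions, and `FakeOldCloud` is a stand-in so the sketch runs without a cloud.

```python
from types import SimpleNamespace


def check_egress_only(rules):
    """Return True if rules are exactly one IPv4 and one IPv6 egress rule."""
    egress = sorted(r["ethertype"] for r in rules if r["direction"] == "egress")
    ingress = [r for r in rules if r["direction"] == "ingress"]
    return egress == ["IPv4", "IPv6"] and not ingress


def check_via_fallback(network, name="scs-test-default-rules"):
    """Create a throwaway group, inspect its initial rules, then delete it."""
    group = network.create_security_group(name=name)
    try:
        return check_egress_only(group.security_group_rules)
    finally:
        # Clean up even if the check itself raises.
        network.delete_security_group(group)


def default_rules_ok(network, api_missing=(LookupError,)):
    """Use the new API if present, otherwise fall back to a probe group.

    `api_missing` is an assumption: pass whatever exception the client
    raises when the default-rules API is unavailable.
    """
    try:
        rules = list(network.default_security_group_rules())
    except api_missing:
        return check_via_fallback(network)
    # Assumed field name marking rules applied to custom groups.
    custom = [r for r in rules if r["used_in_non_default_sg"]]
    return check_egress_only(custom)


class FakeOldCloud:
    """Stand-in for a cloud without the new API (not SDK code)."""

    def __init__(self):
        self.deleted = []

    def default_security_group_rules(self):
        raise LookupError("API not available")

    def create_security_group(self, name):
        rules = [{"direction": "egress", "ethertype": "IPv4"},
                 {"direction": "egress", "ethertype": "IPv6"}]
        return SimpleNamespace(name=name, security_group_rules=rules)

    def delete_security_group(self, group):
        self.deleted.append(group.name)


cloud = FakeOldCloud()
print(default_rules_ok(cloud), cloud.deleted)
```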

fraugabel commented 2 months ago

Backwards compatibility has been added to the test and tested on Yaook in Leipzig: creating a new security group automatically initializes two egress rules (IPv4 and IPv6) but no ingress rule. This information can be successfully requested and confirmed by the test.

fraugabel commented 2 months ago

The SCS compliance check of Yaook concerning the Key Manager Standard failed, because users with the 'member' role are not authorized to use the Key Manager.

User has member role.
User has reader role.
Users with the 'member' role can use Key Manager API: FAIL
According to the Key Manager Standard, users with the 'member' role should be able to use the Key Manager API.
ERROR: ForbiddenException: 403: Client Error for url: https://key-manager.l1a.cloudandheat.com:443/v1/secrets, Forbidden

markus-hentsch commented 2 months ago

Just chiming in here to add some context: this is expected for OpenStack releases earlier than the upcoming 2024.2 ("Dalmatian") release in conjunction with Yaook. The upcoming release will default [oslo.policy]enforce_new_defaults in barbican.conf to True, which in turn migrates Barbican to the new unified role model (reader, member, admin).

For any current Barbican releases, this is not the case and Barbican will still ship its own custom roles^1. Since Yaook mainly uses Barbican's default policies^2, this prevents users with the member role from accessing the functionality. This can be addressed in two ways:

  1. Set [oslo.policy]enforce_new_defaults=True in barbican.conf (not recommended for OpenStack releases older than 2024.1)
  2. Redirect Barbican's creator role to member

Approach no. 2 is easily implemented in Yaook by adjusting the BarbicanDeployment manifest template:

kind: BarbicanDeployment
...
spec:
  ...
  policy:
    "admin": "role:admin"
    "creator": "role:member"
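To illustrate why the redirect in approach no. 2 helps, here is a toy evaluator for a tiny subset of policy syntax (`role:x`, `rule:y`, and `or`). It is not oslo.policy, and the rule names in `LEGACY` are illustrative assumptions rather than a copy of Barbican's shipped policy.

```python
def allowed(rule_name, roles, policy):
    """Evaluate a tiny subset of policy syntax: 'role:x', 'rule:y', 'or'."""
    expression = policy[rule_name]
    for term in expression.split(" or "):
        kind, _, value = term.strip().partition(":")
        if kind == "role" and value in roles:
            return True
        if kind == "rule" and allowed(value, roles, policy):
            return True
    return False


# Illustrative legacy-style rules; the names are assumptions for this sketch.
LEGACY = {
    "admin": "role:admin",
    "creator": "role:creator",
    "admin_or_creator": "rule:admin or rule:creator",
    "secrets:post": "rule:admin_or_creator",
}
# The two overrides from the manifest above.
OVERRIDE = {**LEGACY, "admin": "role:admin", "creator": "role:member"}

user_roles = {"member", "reader"}
print(allowed("secrets:post", user_roles, LEGACY))    # False
print(allowed("secrets:post", user_roles, OVERRIDE))  # True
```

With the override in place, every rule that refers to `creator` is satisfied by the `member` role, which is the effect the manifest change aims for.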

josephineSei commented 2 months ago

Footnotes

1. https://docs.openstack.org/barbican/2024.1/admin/access_control.html#default-policy

2. https://gitlab.com/yaook/operator/-/blob/85dc33e7a9f982c841c951e693d6a0a538bcfc7c/docs/examples/barbican.yaml#L44-45

This should indeed be standard in Yaook, so that normal users (who usually have only the member role) are able to use Barbican. At least until the new secure RBAC is enabled by default, which seems not to be the case even in 2024.2, judging by @markus-hentsch's comments in the role standard.

anjastrunk commented 3 hours ago

@josephineSei @fraugabel @markus-hentsch All associated tasks are done. Can this issue be closed?

mbuechse commented 1 hour ago

Yes, consider it done. We need to adapt a few things in the PoC for key manager and volume backup, but that is already underway.

josephineSei commented 1 hour ago

Concerning the additional three standards:

Availability Zone Standard

AZs are not a must-have, but we should have dedicated fire zones when we want to establish an AZ. This is done per deployment, so we can consider Yaook compatible with this standard.

Mandatory Services

Yaook also needs to roll out a load balancer like Octavia. This is currently under development.

Backup Service for Volumes

This just needs to be enabled by default. This should not be a big blocker.