openshift / compliance-operator

Operator providing OpenShift cluster compliance checks
Apache License 2.0
110 stars 110 forks source link

Enable runtime customization of API resource collection paths for HyperShift support #750

Closed tmishina closed 2 years ago

tmishina commented 3 years ago

Background

When Compliance Operator tries to perform OpenShift configuration check, its API resource collector pod collects check target Kubernetes API resources. For example, many Kube API server configuration rules check the Kubernetes API resource /api/v1/namespaces/openshift-kube-apiserver/configmaps/config. The path (/api...) is hard-coded in each rule (example), and therefore it is impossible to specify an API resource path at runtime.

Issue

HyperShift is an enabler for managed OpenShift clusters. Master-plane components of a guest cluster in the hypershift environment runs on a separate namespace (per-guest-cluster namespace) of a management cluster (see architecture document of hypershift in details).

Example: In a standard installation, a Kubernetes API server pod and its configuration exists in namespace openshift-kube-apiserver. However, in a hypershift environment, a Kubernetes API server pod and its configuration are located in a per-guest-cluster namespace (such as master-guestcluster1). The name of the namespace varies, and therefore the hard-coded path doesn't matter.

Solution (New Feature of the PR)

Change the API resource collector to read the name of an XCCDF variable from a rule, and collect an API resource from the path specified as an XCCDF variable.

Also, changes on ComplianceAsCode/content is required. (One discussion point related to the rule modification is that filepath is still required in the rule to build the rules while actual filepath is externalized)

openshift-ci[bot] commented 3 years ago

Hi @tmishina. Thanks for your PR.

I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
JAORMX commented 3 years ago

@tmishina we already pull in the infrastructure/cluster resource, which could be queried to determine if we're in a HyperShift cluster. There is an open PR [1] that will enable us to query the controlPlaneTopology key in the status, which should be External if we're in a managed HyperShift cluster.

We could then just add a CPE https://github.com/ComplianceAsCode/content/blob/master/shared/checks/oval/installed_app_is_ocp4.xml to determine if we should run a check. e.g. installed_app_is_ocp4_hypershift or something of the sort...

[1] https://github.com/openshift/hypershift/pull/539

tmishina commented 3 years ago

Thank you for pointing the PR. My understanding is that the feature of the PR (assigning a special label External to labels or annotations) is useful to automatically select rules in a guest cluster (red-colored ones in the dialog below). However, the feature of this PR is still needed for Compliance Operator instances running on a management cluster (blue one) because the name of namespaces for master-plane components cannot be determined at the time of rule build. I think two PRs are complementary.

co-hs-architecture-diagram
JAORMX commented 3 years ago

Thank you for pointing the PR. My understanding is that the feature of the PR (assigning a special label External to labels or annotations) is useful to automatically select rules in a guest cluster (red-colored ones in the dialog below). However, the feature of this PR is still needed for Compliance Operator instances running on a management cluster (blue one) because the name of namespaces for master-plane components cannot be determined at the time of rule build. I think two PRs are complementary.

co-hs-architecture-diagram

gotcha, I see where you're coming from. Now... wouldn't we then only want to enable variable support for a single part of the API? e.g. have something like:

<html:code class="ocp-api-endpoint">/api/v1/namespaces/<xccdf-1.2:sub idref="xccdf_org.ssgproject.content_value_foo" use="legacy"/>/configmaps/config</html:code>

This would allow you to reference variables and even be able to set them dynamically using TailoredProfiles if needed.

tmishina commented 3 years ago

When will an element <xccdf-1.2:sub> be replaced with its referring value? If my understanding is correct, an API resource check pod runs as follows, and the substitution will take place at step 3, and therefore API resource collector cannot gain the benefit of using <xccdf-1.2:sub>.

  1. init container content-container extracts content (data stream) from image to local (temporal) storage
  2. init container api-resource-collector reads the data stream and collect API resources specified in <html:code>
  3. container scanner performs oscap command
  4. container log-collector collects create a report from the scan result
JAORMX commented 3 years ago

When will an element <xccdf-1.2:sub> be replaced with its referring value? If my understanding is correct, an API resource check pod runs as follows, and the substitution will take place at step 3, and therefore API resource collector cannot gain the benefit of using <xccdf-1.2:sub>.

1. init container `content-container` extracts content (data stream) from image to local (temporal) storage

2. init container `api-resource-collector` reads the data stream and collect API resources specified in `<html:code>`

3. container `scanner` performs `oscap` command

4. container `log-collector` collects create a report from the scan result

right, the intention of my comment was to say that if we want this functionality to work, we could add substitution functionality to the Compliance Operator for this kind of things. The functionality is not there yet, so it needs to be coded, and probably take part of step 2.

tmishina commented 3 years ago

OK, then let me summarize tasks.

  1. implement a macro function (in macros.jinja) which performs:
    1. embedding <external_variable> to XCCDF data stream
    2. embedding <xccdf-1.2:sub> to <code class="ocp-api-endpoint"> in the XCCDF data stream
  2. implement a new functionality (in API resource collector) which performs substitution specified with <xccdf-1.2:sub>

One thing I'd like to know is how to embed <external_variable>. The macro will be called in .warnings.general in rule.yml, and the output of macro goes under <xccdf-1.2:warning> element. if my understanding is correct, the <external_variable> element should be located at upper level (e.g., direct child of <xccdf-1.2:Profile>). Is there any way to do so?

Vincent056 commented 3 years ago

@JAORMX Would this be similar to how we handle remediation templating that we can use a tailored profile to set a value, in this case, it is a custom API resource name?

JAORMX commented 3 years ago

OK, then let me summarize tasks.

1. implement a macro function (in [macros.jinja](https://github.com/ComplianceAsCode/content/blob/master/shared/macros.jinja)) which performs:

   1. embedding `<external_variable>` to XCCDF data stream
   2. embedding `<xccdf-1.2:sub>` to `<code class="ocp-api-endpoint">` in the XCCDF data stream

2. implement a new functionality (in [API resource collector](https://github.com/openshift/compliance-operator/blob/master/pkg/utils/parse_arf_result.go#L84-L111)) which performs substitution specified with `<xccdf-1.2:sub>`

One thing I'd like to know is how to embed <external_variable>. The macro will be called in .warnings.general in rule.yml, and the output of macro goes under <xccdf-1.2:warning> element. if my understanding is correct, the <external_variable> element should be located at upper level (e.g., direct child of <xccdf-1.2:Profile>). Is there any way to do so?

@tmishina : The warning would contain xccdf-1.2:sub indicating that a substitution needs to be made. The variable needs to be set somewhere... it could be on a Profile as you say. But most likely than not it'll be in a tailoring. @Vincent056 did very similar work to substitute variables on remediations, where, we take the default value of a variable if it isn't set, else, we look a the tailoring.

The idea is that we'd be able to substitute the aspects that are unique for a deployment, and we'd set them up in a tailoring when needed. And the default case would then just work for regular OpenShift.

@JAORMX Would this be similar to how we handle remediation templating that we can use a tailored profile to set a value, in this case, it is a custom API resource name?

@Vincent056 exactly!

Vincent056 commented 3 years ago

@tmishina Here are a few PRs I think might be helpful, We are currently rendering the XCCDF variable's value in the rule's description and rational [1] section, here we get the default value list from datastream, and we then parse all the rules, we replace the place where an XCCDF variable present to the default value using the value list.

Also, we are using the XCCDF variable in remediations,[2][3] the way we substitute value is similar to the one above, however, we only get XCCDF variable value from result ConfigMap[6], which is the values used to do the scanning, the values could be from datastream default value as well as tailored profile, so a little correction to what @JAORMX mentioned about tailor profile above. However, we are indeed using tailored profile to check if the variable is set in a tailored profile if a certain value is in value required annotation[4] , I think you can change the isRequiredValueSet function[5] to extract the value you needed from the tailored profile, and then parse them in warning section.

I think what you can do is to create a new variable, and set the default value in the compliance as code content that is used to generate the datastream (ex. [7]), and then use XCCDF variable to replace openshift-kube-apiserver to that variable you create. In compliance operator, you can first look for value in tailored profile, if not found, then it will go to datastream to use the default value(ex. in this case, the default value could be openshift-kube-apiserver). , so that the implementation will still work for the normal openshift cluster if no value is provided in the tailored profile.

@JAORMX Does this sounds good to you?

Link:

[1] https://github.com/openshift/compliance-operator/pull/738 [2] https://github.com/openshift/compliance-operator/pull/672 [3] https://github.com/openshift/compliance-operator/pull/727 [4] https://github.com/openshift/compliance-operator/blob/master/doc/remediation-templating.md [5] https://github.com/openshift/compliance-operator/blob/88f0dc5103ceb2edc2594bfc040842eb667ac822/pkg/controller/complianceremediation/complianceremediation_controller.go#L458 [6] https://github.com/openshift/compliance-operator/blob/88f0dc5103ceb2edc2594bfc040842eb667ac822/pkg/utils/parse_arf_result.go#L396 [7] https://github.com/ComplianceAsCode/content/blob/master/applications/openshift/kubelet/var_kubelet_evictionhard_imagefs_available.var

tmishina commented 3 years ago

@Vincent056 Thank you very much, I think expanding the substitution feature from description element to warning element to enables the API resource collector pod to collect a resource specified in a variable.

Another issue is that we need to specify the check file target in a template yamlfile_value. To solve the issue, I tired to add a new element filepath_xccdf_variable to tell the name of variable to the yamlfile_value template.

I had tried to reuse the variable designated in warning element, but for me it is impossible because the variable specified in warning element cannot been referred in oval.template.

Do you have any idea to solve the issue without using the additional variable (filepath_xccdf_variable)?

JAORMX commented 3 years ago

@tmishina do you have a PR already in ComplianceAsCode? Maybe we could discuss solutions there instead and include the maintainers of CaC. They might be able to help.

tmishina commented 3 years ago

@JAORMX Thank you for suggestion, I've note created a PR on ComplianceAsCode/content yet. I will do that.

I have tried to implement the idea discussed on this PR. https://github.com/openshift/compliance-operator/compare/master...tmishina:parameter-subst-in-warnings

I will write test for this (I'm learning the test of this project; scap_test.go looks like the best position to add test for this new feature).

tmishina commented 3 years ago

@JAORMX I have implemented test, and changed the code of the source branch custom-api-resource from my old one to the one we are discussing now (merged sub-branch parameter-subst-in-warnings into the source branch). I'd like to hear your comment on the following topics, thanks!

Vincent056 commented 3 years ago

@Vincent056 Thank you very much, I think expanding the substitution feature from description element to warning element to enables the API resource collector pod to collect a resource specified in a variable.

Another issue is that we need to specify the check file target in a template yamlfile_value. To solve the issue, I tired to add a new element filepath_xccdf_variable to tell the name of variable to the yamlfile_value template.

* Example rule: https://github.com/tmishina/content/blob/hypershift-support/applications/openshift/api-server/api_server_admission_control_plugin_AlwaysAdmit/rule.yml#L45

* change in `oval.template` of `yamlfile_value` template: https://github.com/tmishina/content/blob/hypershift-support/shared/templates/yamlfile_value/oval.template#L21-L31

I had tried to reuse the variable designated in warning element, but for me it is impossible because the variable specified in warning element cannot been referred in oval.template.

Do you have any idea to solve the issue without using the additional variable (filepath_xccdf_variable)?

Agree with @JAORMX I think it might be good to get CaC repo maintainers involved on this one.

tmishina commented 2 years ago

I found important issue at the test with my modified ComplianceAsCode/content.

The issue is that customized variables specified in a tailored profile resource does not affect the API resource collector, and it collects resources specified as default value for each variable. It ignores customized values in the tailored profile resource. That is not an expected behaviour.

The reason is that API resource collector checks warnings elements in the data stream stored in a ComplianceAsCode/content image (no substitution), not in a profile resource (substituted). (I've not checked whether this also occurs in description/rationale elements)

As far as I know, there are two solution candidates.

  1. use my first code (introduce special macro to embed variable reference in a data stream image and parse it in CO)
    • pros: small modification
    • cons: substitution mechanism differ than that for description/rationale of profile resource
  2. modify the API resource collector to use warnings in kube resource instead of data stream
    • pros: same substitution mechanism as description/rationale
      • cons: a number of (maybe 10-100 order) lines should be changed (mainly at scap.go)

My preference is 1, but @Vincent056, @JAORMX, I'd appreciate it if I can hear your opinions, thanks.

Vincent056 commented 2 years ago

/ok-to-test

openshift-ci[bot] commented 2 years ago

@tmishina: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).
openshift-ci[bot] commented 2 years ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tmishina, Vincent056

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files: - ~~[OWNERS](https://github.com/openshift/compliance-operator/blob/master/OWNERS)~~ [Vincent056] Approvers can indicate their approval by writing `/approve` in a comment Approvers can cancel approval by writing `/approve cancel` in a comment