ComplianceAsCode / compliance-operator

Operator providing Kubernetes cluster compliance checks
Apache License 2.0

CMP-2693: Use CLI image for base image in must-gather #543

Closed rhmdnd closed 1 month ago

rhmdnd commented 1 month ago

The original approach for building a must-gather image specifically for Compliance Operator usage grabbed the latest stock must-gather image (the one for collecting everything in an ordinary deployment), pushed the original scripts out of the way, and replaced them with Compliance Operator specific scripts to collect the information we wanted.

While this worked, we can simplify the image dependency by relying on the CLI image directly, since that's what the upstream must-gather image does, and then wiring up the entry point to the collection scripts we already have, following the same pattern that the upstream must-gather image uses.

This commit also updates the Dockerfile name to include the .ocp suffix, since we're relying on a Red Hat registry to pull the CLI image.
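As a rough illustration of how small the resulting image definition is, the sketch below prints a three-step Dockerfile that mirrors the build steps podman reports later in this thread: a CLI base image, the collection scripts copied into /usr/bin, and /usr/bin/gather as the entry point. The function name `render_must_gather_dockerfile` is hypothetical, and the actual images/must-gather/Dockerfile.ocp in the repository may differ in detail.

```shell
# Sketch only: prints a Dockerfile equivalent to the three build steps
# shown in the podman output in this thread. The function name is
# hypothetical; the real Dockerfile.ocp may contain more than this.
render_must_gather_dockerfile() {
    cat <<'EOF'
FROM registry.ci.openshift.org/ocp/4.17:cli
COPY utils/must-gather/* /usr/bin/
ENTRYPOINT /usr/bin/gather
EOF
}
```

To try it, the output could be written to a scratch file (for example `render_must_gather_dockerfile > /tmp/Dockerfile.sketch`) and passed to `podman build -f /tmp/Dockerfile.sketch .` from the repository root.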

openshift-ci-robot commented 1 month ago

@rhmdnd: This pull request references CMP-2693 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.17.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=ComplianceAsCode%2Fcompliance-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

rhmdnd commented 1 month ago

Hint for reviewers: build the new image locally; it should then work with oc adm must-gather --image=$IMAGE.

rhmdnd commented 1 month ago

/jira-refresh

BhargaviGudi commented 1 month ago

/hold for test

rhmdnd commented 1 month ago

/test e2e-aws-parallel

Failure during metrics tests... which may be transient.

github-actions[bot] commented 1 month ago

:robot: To deploy this PR, run the following command:

make catalog-deploy CATALOG_IMG=ghcr.io/complianceascode/compliance-operator-catalog:543
BhargaviGudi commented 1 month ago

I'm unable to deploy the PR. @rhmdnd, could you please help me deploy it?

$ make catalog-deploy CATALOG_IMG=ghcr.io/complianceascode/compliance-operator-catalog:543
namespace/openshift-compliance created
Replacing image reference in config/catalog/catalog-source.yaml
catalogsource.operators.coreos.com/compliance-operator configured
Restoring image reference in config/catalog/catalog-source.yaml
Replacing namespace reference in config/catalog/operator-group.yaml
operatorgroup.operators.coreos.com/compliance-operator created
Restoring namespace reference in config/catalog/operator-group.yaml
Replacing namespace reference in config/catalog/subscription.yaml
subscription.operators.coreos.com/compliance-operator-sub created
Restoring namespace reference in config/catalog/subscription.yaml
$ oc project openshift-compliance 
Already on project "openshift-compliance" on server "https://api.bgudi-spo.qe.gcp.devcluster.openshift.com:6443".
$ oc get sub
NAME                      PACKAGE               SOURCE                CHANNEL
compliance-operator-sub   compliance-operator   compliance-operator   alpha
$ oc get pods
No resources found in openshift-compliance namespace.
$ oc get pb
error: the server doesn't have a resource type "pb"
Vincent056 commented 1 month ago

You would have to build the image yourself and use it; we don't have a workflow to build a PR image for must-gather yet.

rhmdnd commented 1 month ago

You should be able to do this by overriding IMAGE_REPO and calling make must-gather, then testing it on your cluster using oc adm must-gather --image=$IMAGE_REPO, where $IMAGE_REPO includes the location of the must-gather image you just built.
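The flow described above can be sketched as two small shell functions. This is only an illustration under stated assumptions: the image naming pattern `<IMAGE_REPO>/must-gather-ocp:<tag>` is inferred from the build output in this thread, the function names are hypothetical, and the Makefile variable that controls the tag is not shown here, so the tag passed in must match whatever the build actually produced.

```shell
# Sketch only. Assumes the Makefile names the image
# "<IMAGE_REPO>/must-gather-ocp:<tag>"; that pattern and the function
# names below are assumptions, not confirmed repository behaviour.

# Compose the full image reference from a repository prefix and a tag.
# A single trailing slash on the prefix is tolerated.
must_gather_image() {
    repo=$1
    tag=$2
    printf '%s/must-gather-ocp:%s\n' "${repo%/}" "$tag"
}

# Build (and push) via the Makefile with IMAGE_REPO overridden, then run
# must-gather on the current cluster with the resulting image.
# The tag argument must match the tag the Makefile produced.
build_and_gather() {
    repo=$1
    tag=$2
    make must-gather IMAGE_REPO="$repo" &&
        oc adm must-gather --image="$(must_gather_image "$repo" "$tag")"
}
```

For example, `build_and_gather quay.io/bgudi pr-543-1` would run the build and then collect with quay.io/bgudi/must-gather-ocp:pr-543-1, provided the cluster can pull from that repository.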

BhargaviGudi commented 1 month ago

Verification failed at the deployment step with the error below:

$ make must-gather
podman build -t quay.io/bgudi/must-gather-ocp:pr-543-3 -f images/must-gather/Dockerfile.ocp .
STEP 1/3: FROM registry.ci.openshift.org/ocp/4.17:cli
Trying to pull registry.ci.openshift.org/ocp/4.17:cli...
Getting image source signatures
Copying blob c9fa557c0820 done   | 
Copying blob ca1636478fe5 done   | 
Copying blob a9d0d9953a70 done   | 
Copying blob ce7b0a7ae065 done   | 
Copying blob 10da2e8c15f4 done   | 
Copying config 98dd9c8b3c done   | 
Writing manifest to image destination
STEP 2/3: COPY utils/must-gather/* /usr/bin/
--> 6fea5b332b0c
STEP 3/3: ENTRYPOINT /usr/bin/gather
COMMIT quay.io/bgudi/must-gather-ocp:pr-543-3
--> 0b9753f35765
Successfully tagged quay.io/bgudi/must-gather-ocp:pr-543-3
0b9753f35765e1d6fa23fd4ba879c4ab0ba9f35f856939ec82f56e59fd003e4a
make: *** No rule to make target 'must-gather-push', needed by 'must-gather'.  Stop.

PR #546 has been raised to fix the issue.

BhargaviGudi commented 1 month ago

Verification failed with 4.17.0-0.nightly-2024-07-07-131215 + #543 + the lines of #546 copied into the Makefile. cc: @rhmdnd

$ make must-gather
podman build -t quay.io/bgudi/must-gather-ocp:pr-543-new -f images/must-gather/Dockerfile.ocp .
STEP 1/3: FROM registry.ci.openshift.org/ocp/4.17:cli
STEP 2/3: COPY utils/must-gather/* /usr/bin/
--> Using cache 6fea5b332b0cbcf7d7175f3e54c93494eb8e91b72db7d6baa9f0e3c2fd306544
--> 6fea5b332b0c
STEP 3/3: ENTRYPOINT /usr/bin/gather
--> Using cache 0b9753f35765e1d6fa23fd4ba879c4ab0ba9f35f856939ec82f56e59fd003e4a
COMMIT quay.io/bgudi/must-gather-ocp:pr-543-new
--> 0b9753f35765
Successfully tagged quay.io/bgudi/must-gather-ocp:pr-543-new
0b9753f35765e1d6fa23fd4ba879c4ab0ba9f35f856939ec82f56e59fd003e4a
podman push quay.io/bgudi/must-gather-ocp:pr-543-new
Getting image source signatures
Copying blob 6dc07b7592a6 done   | 
Copying blob 06aab3d1c910 done   | 
Copying blob bd7f057b23b0 done   | 
Copying blob a48d1a32c2f5 done   | 
Copying blob 9e1ee7e8fbb6 done   | 
Copying blob 7820e945a904 done   | 
Copying config 0b9753f357 done   | 
Writing manifest to image destination

Executed $ oc adm must-gather --image=quay.io/bgudi/must-gather-ocp and observed the error below:

Reprinting Cluster State:
When opening a support case, bugzilla, or issue please include the following summary data along with any other requested information:
ClusterID: 93b3d70b-2203-4268-bff2-c1da5915bef1
ClusterVersion: Stable at "4.17.0-0.nightly-2024-07-07-131215"
ClusterOperators:
    All healthy and stable

error: gather did not start for pod must-gather-x884m: unable to pull image: ImagePullBackOff: Back-off pulling image "quay.io/bgudi/must-gather-ocp"

See the files below for more details: must-gather-543.txt, must-gather.local.7328280254091606881.zip

prb112 commented 1 month ago

quay.io/bgudi/must-gather-ocp:pr-543-new

you should use oc adm must-gather --image=quay.io/bgudi/must-gather-ocp:pr-543-new

Right?
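The point about the tag can be checked mechanically. An image reference without a tag or digest resolves to the `latest` tag, and the push output above only shows the pr-543-new tag being pushed. The helper below is a minimal sketch with a hypothetical name, `has_explicit_tag`, that reports whether a reference carries a tag or digest.

```shell
# Sketch only: returns 0 when the image reference names a tag or digest,
# 1 otherwise. Only the last path component is inspected, so a registry
# port such as "localhost:5000" is not mistaken for a tag.
has_explicit_tag() {
    last=${1##*/}   # drop registry and namespace, keep "name[:tag]"
    case $last in
        *:*|*@*) return 0 ;;
        *)       return 1 ;;
    esac
}
```

A caller could guard the collection step with it, for example `has_explicit_tag "$IMAGE" || echo "warning: $IMAGE will resolve to :latest"`.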

BhargaviGudi commented 1 month ago

@prb112 I'm still observing the same error with the oc adm must-gather --image=quay.io/bgudi/must-gather-ocp:pr-543-new command:

error: gather did not start for pod must-gather-n7mkh: unable to pull image: ImagePullBackOff: Back-off pulling image "quay.io/bgudi/must-gather-ocp:pr-543-new"

prb112 commented 1 month ago

❯ crane manifest quay.io/bgudi/must-gather-ocp:pr-543-new
Error: fetching manifest quay.io/bgudi/must-gather-ocp:pr-543-new: GET https://quay.io/v2/bgudi/must-gather-ocp/manifests/pr-543-new: UNAUTHORIZED: access to the requested resource is not authorized; map[]

I think your repo is private
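One way to tell this case apart from a missing tag is to look at the error code the registry returns. The sketch below classifies error text by the standard registry error codes (UNAUTHORIZED and DENIED for access problems, MANIFEST_UNKNOWN and NAME_UNKNOWN for missing content). The function name `classify_registry_error` is hypothetical, and some registries answer UNAUTHORIZED for repositories that do not exist at all, so "auth" here means "private or not visible", not strictly "private".

```shell
# Sketch only: maps registry error text to a coarse category.
# "auth"    - access refused (private repository, or one the registry
#             will not reveal to anonymous clients)
# "missing" - repository or tag not found
# "unknown" - anything else
classify_registry_error() {
    case $1 in
        *UNAUTHORIZED*|*DENIED*)           echo "auth" ;;
        *MANIFEST_UNKNOWN*|*NAME_UNKNOWN*) echo "missing" ;;
        *)                                 echo "unknown" ;;
    esac
}
```

It could be fed the output of the command used above, for example `classify_registry_error "$(crane manifest "$IMAGE" 2>&1)"`, run from a session that is not logged in to the registry.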

BhargaviGudi commented 1 month ago

Verification passed with 4.17.0-0.nightly-2024-07-07-131215 + #543 + the content of #546 copied into the Makefile

  1. Install CO
  2. make must-gather
    $ make must-gather
    podman build -t quay.io/bgudi/must-gather-ocp:pr-543-1 -f images/must-gather/Dockerfile.ocp .
    STEP 1/3: FROM registry.ci.openshift.org/ocp/4.17:cli
    STEP 2/3: COPY utils/must-gather/* /usr/bin/
    --> Using cache 6fea5b332b0cbcf7d7175f3e54c93494eb8e91b72db7d6baa9f0e3c2fd306544
    --> 6fea5b332b0c
    STEP 3/3: ENTRYPOINT /usr/bin/gather
    --> Using cache 0b9753f35765e1d6fa23fd4ba879c4ab0ba9f35f856939ec82f56e59fd003e4a
    COMMIT quay.io/bgudi/must-gather-ocp:pr-543-1
    --> 0b9753f35765
    Successfully tagged quay.io/bgudi/must-gather-ocp:pr-543-1
    Successfully tagged quay.io/bgudi/must-gather-ocp:pr-543-new
    0b9753f35765e1d6fa23fd4ba879c4ab0ba9f35f856939ec82f56e59fd003e4a
    podman push quay.io/bgudi/must-gather-ocp:pr-543-1
    Getting image source signatures
    Copying blob e57c44f82922 skipped: already exists  
    Copying blob ce7b0a7ae065 skipped: already exists  
    Copying blob 3059f6068401 skipped: already exists  
    Copying blob c9fa557c0820 skipped: already exists  
    Copying blob 10da2e8c15f4 skipped: already exists  
    Copying blob a9d0d9953a70 skipped: already exists  
    Copying config 0b9753f357 done   | 
    Writing manifest to image destination
  3. $ oc adm must-gather --image=quay.io/bgudi/must-gather-ocp:pr-543-1
    Files: must-gather-543.txt, must-gather.local.3078308017389767560.zip

BhargaviGudi commented 1 month ago

/unhold

BhargaviGudi commented 1 month ago

/label qe-approved

openshift-ci-robot commented 1 month ago

@rhmdnd: This pull request references CMP-2693 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target either version "4.17." or "openshift-4.17.", but it targets "compliance-operator-1.6.0" instead.

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rhmdnd, Vincent056, yuumasato

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/ComplianceAsCode/compliance-operator/blob/master/OWNERS)~~ [Vincent056,rhmdnd]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

BhargaviGudi commented 1 month ago

/hold for a while, as there is some confusion about the image used.

BhargaviGudi commented 1 month ago

/unhold

rhmdnd commented 1 month ago

/test e2e-rosa

ROSA e2e failed on unrelated authentication errors.

yuumasato commented 1 month ago

/test e2e-aws-parallel