Closed. StevenBarre closed this issue 9 months ago.
Asked Matt the question in RC: https://chat.developer.gov.bc.ca/channel/devops-operations-lab?msg=CqfeWinoAiPEhH7P8
Summary of my investigation:
When an operator is installed by OLM, a stripped-down copy of its CSV is created in every namespace the operator is configured to watch. These stripped-down CSVs are known as “Copied CSVs” and communicate to users which controllers are actively reconciling resource events in a given namespace. To support larger clusters, OLM allows users to disable Copied CSVs for operators installed in the `AllNamespace` mode by setting the cluster olmConfig’s `spec.features.disableCopiedCSVs` field to `true`.
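For reference, the same field can also be set with a merge patch instead of a full manifest. This is only a sketch: the function names `olm_copied_csv_patch` and `set_copied_csvs_disabled` are made up for illustration, and it assumes a logged-in `oc` session with cluster-admin rights.

```shell
# Hypothetical helper: build the merge patch for the OLMConfig feature flag.
# Argument: "true" to disable Copied CSVs, "false" to re-enable them.
olm_copied_csv_patch() {
  case "$1" in
    true|false)
      printf '{"spec":{"features":{"disableCopiedCSVs":%s}}}\n' "$1"
      ;;
    *)
      echo "usage: olm_copied_csv_patch true|false" >&2
      return 1
      ;;
  esac
}

# Hypothetical helper: apply the patch to the cluster-scoped OLMConfig
# named "cluster". Assumes `oc` is logged in with cluster-admin rights.
set_copied_csvs_disabled() {
  patch=$(olm_copied_csv_patch "$1") || return 1
  oc patch olmconfig cluster --type=merge -p "$patch"
}
```

Calling `set_copied_csvs_disabled true` would disable Copied CSVs, and `set_copied_csvs_disabled false` would turn them back on.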
If an operator is installed in a different namespace than the one it is configured to reconcile events in, users will not be able to view the operator in the OperatorHub or CLI. However, operators affected by this limitation are still available and continue to reconcile events in the user’s namespace. Users can still interact with the operator's custom resources using `kubectl` or `oc` commands.
Need to confirm these in Lab Clusters.
Disabling Copied CSV in AMS ROSA for testing.
ADVSOL-AMS/redhat-ods-operator ~ $ oc get OLMConfig cluster
NAME AGE
cluster 260d
ADVSOL-AMS/redhat-ods-operator ~ $ oc apply -f - <<EOF
> apiVersion: operators.coreos.com/v1
> kind: OLMConfig
> metadata:
>   name: cluster
> spec:
>   features:
>     disableCopiedCSVs: true
> EOF
Warning: resource olmconfigs/cluster is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by oc apply. oc apply should only be used on resources created declaratively by either oc create --save-config or oc apply. The missing annotation will be patched automatically.
olmconfig.operators.coreos.com/cluster configured
// Status
$ oc get OLMConfig -o yaml
apiVersion: v1
items:
- apiVersion: operators.coreos.com/v1
  kind: OLMConfig
  metadata:
    <...>
  spec:
    features:
      disableCopiedCSVs: true
  status:
    conditions:
    - lastTransitionTime: "2023-04-14T23:47:23Z"
      message: Copied CSVs are disabled and no unexpected copied CSVs were found for
        operators installed in AllNamespace mode
      reason: CopiedCSVsDisabled
      status: "True"
      type: DisabledCopiedCSVs
<...>
// Events in openshift-operators also show the following:
ADVSOL-AMS/default ~ $ oc get events -n openshift-operators | grep DisabledCopiedCSVs
2m5s Warning DisabledCopiedCSVs clusterserviceversion/devworkspace-operator.v0.19.1-0.1679521112.p CSV copying disabled for openshift-operators/devworkspace-operator.v0.19.1-0.1679521112.p
2m5s Warning DisabledCopiedCSVs clusterserviceversion/jaeger-operator.v1.43.0 CSV copying disabled for openshift-operators/jaeger-operator.v1.43.0
2m5s Warning DisabledCopiedCSVs clusterserviceversion/kiali-operator.v1.57.6 CSV copying disabled for openshift-operators/kiali-operator.v1.57.6
2m5s Warning DisabledCopiedCSVs clusterserviceversion/servicemeshoperator.v2.3.2 CSV copying disabled for openshift-operators/servicemeshoperator.v2.3.2
2m5s Warning DisabledCopiedCSVs clusterserviceversion/web-terminal.v1.7.0-0.1681197295.p CSV copying disabled for openshift-operators/web-terminal.v1.7.0-0.1681197295.p
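To pull just the affected CSV names out of an event list like the one above, something like the following could work. It is a sketch that assumes the default `oc get events` column order (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE); `list_disabled_csv_events` is a hypothetical helper name, not part of `oc`.

```shell
# Hypothetical helper: read `oc get events` output on stdin and print the
# name of each CSV that reported a DisabledCopiedCSVs event.
# Assumes REASON is the third whitespace-separated column and OBJECT the fourth.
list_disabled_csv_events() {
  awk '$3 == "DisabledCopiedCSVs" {
    # Strip the resource-kind prefix so only the CSV name remains.
    sub(/^clusterserviceversion\//, "", $4)
    print $4
  }'
}
```

Usage would look like `oc get events -n openshift-operators | list_disabled_csv_events`.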
I will keep this running over the weekend and apply it to KLAB/KLAB2 next week.
Some more testing on the ROSA cluster: disabling and re-enabling Copied CSVs
// When `disableCopiedCSVs: true` - No CSVs can be seen in the user namespace
ADVSOL-AMS/tats ~ $ oc get csv
No resources found in tats namespace.
// Re-enable copied CSVs: `disableCopiedCSVs: false`.
apiVersion: v1
items:
- apiVersion: operators.coreos.com/v1
  kind: OLMConfig
  metadata:
    <...>
  spec:
    features:
      disableCopiedCSVs: false
// Copied CSVs in the user namespace are coming back
ADVSOL-AMS/tats ~ $ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
devworkspace-operator.v0.19.1-0.1679521112.p DevWorkspace Operator 0.19.1+0.1679521112.p devworkspace-operator.v0.19.1 Succeeded
elasticsearch-operator.v5.6.4 OpenShift Elasticsearch Operator 5.6.4 elasticsearch-operator.v5.6.3 Succeeded
jaeger-operator.v1.44.0 Community Jaeger Operator 1.44.0 jaeger-operator.v1.43.0 Succeeded
kiali-operator.v1.57.6 Kiali Operator 1.57.6 kiali-operator.v1.57.5 Succeeded
observability-operator.v0.0.20 Observability Operator 0.0.20 observability-operator.v0.0.19 Succeeded
rhods-operator.1.24.0 Red Hat OpenShift Data Science 1.24.0 rhods-operator.1.23.0 Succeeded
route-monitor-operator.v0.1.493-a866e7c Route Monitor Operator 0.1.493-a866e7c route-monitor-operator.v0.1.489-7d9fe90 Succeeded
servicemeshoperator.v2.3.2 Red Hat OpenShift Service Mesh 2.3.2-0 servicemeshoperator.v2.3.1 Succeeded
web-terminal.v1.7.0-0.1681197295.p Web Terminal 1.7.0+0.1681197295.p web-terminal.v1.7.0 Succeeded
ADVSOL-AMS/tats ~ $
disableCopiedCSVs: false
disableCopiedCSVs: true
ADVSOL-AMS/tats ~ $ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
cloud-native-postgresql.v1.19.1 EDB Postgres for Kubernetes 1.19.1 Succeeded
I will run the same tests on K/CLAB next.
Tested on KLAB and KLAB2. Both gave the same results as the ROSA test above.
// create a project
NSX KLAB2/default ~ $ oc new-project tats-test
Now using project "tats-test" on server "https://api.klab2.devops.gov.bc.ca:6443".
You can add applications to this project with the 'new-app' command. For example, try:
oc new-app rails-postgresql-example
to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:
kubectl create deployment hello-node --image=k8s.gcr.io/e2e-test-images/agnhost:2.33 -- /agnhost serve-hostname
// check copied CSV
NSX KLAB2/tats-test ~ $ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
amqstreams.v2.3.0-3 AMQ Streams 2.3.0-3 amqstreams.v2.3.0-2 Succeeded
devworkspace-operator.v0.19.1-0.1679521112.p DevWorkspace Operator 0.19.1+0.1679521112.p devworkspace-operator.v0.18.1-0.1675929565.p Succeeded
eap-operator.v2.3.10 JBoss EAP 2.3.10 eap-operator.v2.3.9 Succeeded
elasticsearch-operator.5.5.7 OpenShift Elasticsearch Operator 5.5.7 elasticsearch-operator.5.5.6 Succeeded
must-gather-operator.v1.1.2 Must Gather Operator 1.1.2 must-gather-operator.v1.1.1 Succeeded
openshift-gitops-operator.v1.6.6 Red Hat OpenShift GitOps 1.6.6 openshift-gitops-operator.v1.6.5 Succeeded
openshift-pipelines-operator-rh.v1.8.2 Red Hat OpenShift Pipelines 1.8.2 Succeeded
red-hat-camel-k-operator.v1.8.2-0.1675913507.p Red Hat Integration - Camel K 1.8.2+0.1675913507.p red-hat-camel-k-operator.v1.6.10 Succeeded
rhacs-operator.v3.74.2 Advanced Cluster Security for Kubernetes 3.74.2 rhacs-operator.v3.74.1 Succeeded
service-registry-operator.v2.1.4 Red Hat Integration - Service Registry Operator 2.1.4 service-registry-operator.v2.1.3 Succeeded
NSX KLAB2/tats-test ~ $
// Disable copied CSV
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OLMConfig
metadata:
  name: cluster
spec:
  features:
    disableCopiedCSVs: true
EOF
// OLMConfig - `disableCopiedCSVs: true`
NSX KLAB2/tats-test ~ $ oc get OLMConfig cluster -o yaml
apiVersion: operators.coreos.com/v1
kind: OLMConfig
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"operators.coreos.com/v1","kind":"OLMConfig","metadata":{"annotations":{},"name":"cluster"},"spec":{"features":{"disableCopiedCSVs":true}}}
    release.openshift.io/create-only: "true"
  creationTimestamp: "2023-02-01T20:59:41Z"
  generation: 2
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 4a402525-1447-400f-858e-d05940feef96
  resourceVersion: "442952687"
  uid: d8740def-bf6a-4a2c-bec5-9021853d44eb
spec:
  features:
    disableCopiedCSVs: true
status:
  conditions:
  - lastTransitionTime: "2023-02-01T21:00:25Z"
    message: Copied CSVs are disabled and at least one copied CSV was found for an
      operator installed in AllNamespace mode
    reason: CopiedCSVsFound
    status: "False"
    type: DisabledCopiedCSVs
// After a few minutes, all copied CSVs are gone.
NSX KLAB2/tats-test ~ $ oc get csv
No resources found in tats-test namespace.
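The "wait a few minutes" step could be scripted by polling the `DisabledCopiedCSVs` condition until it reports `"True"`, which is what the ROSA status output earlier showed once no copied CSVs were left. This is a rough sketch, not something run as part of these tests: the function names, the default of 30 attempts, and the 10-second interval are all assumptions.

```shell
# Hypothetical helper: print the status ("True"/"False") of the
# DisabledCopiedCSVs condition on the cluster OLMConfig.
disabled_copied_csvs_status() {
  oc get olmconfig cluster \
    -o jsonpath='{.status.conditions[?(@.type=="DisabledCopiedCSVs")].status}'
}

# Hypothetical helper: poll until the condition reports "True" or the
# attempts run out.
# $1 = number of attempts (assumed default 30)
# $2 = seconds between attempts (assumed default 10)
wait_for_copied_csvs_removed() {
  attempts=${1:-30}
  interval=${2:-10}
  while [ "$attempts" -gt 0 ]; do
    if [ "$(disabled_copied_csvs_status)" = "True" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  return 1
}
```

For example, `wait_for_copied_csvs_removed 60 10 && oc get csv` would wait up to roughly ten minutes before listing the CSVs in the current namespace.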
// Create a new project
KLAB/tats-test ~ $ oc new-project tats-dev
Now using project "tats-dev" on server "https://api.klab.devops.gov.bc.ca:6443".
You can add applications to this project with the 'new-app' command. For example, try:
oc new-app rails-postgresql-example
to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:
kubectl create deployment hello-node --image=k8s.gcr.io/e2e-test-images/agnhost:2.33 -- /agnhost serve-hostname
// Copied CSVs are created in the new namespace.
KLAB/tats-dev ~ $ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
custom-metrics-autoscaler.v2.8.2-174 Custom Metrics Autoscaler 2.8.2-174 custom-metrics-autoscaler.v2.7.1 Succeeded
devworkspace-operator.v0.19.1-0.1679521112.p DevWorkspace Operator 0.19.1+0.1679521112.p devworkspace-operator.v0.18.1-0.1675929565.p Succeeded
eap-operator.v2.3.10 JBoss EAP 2.3.10 eap-operator.v2.3.9 Succeeded
elasticsearch-operator.5.5.5 OpenShift Elasticsearch Operator 5.5.5 elasticsearch-operator.5.5.4 Succeeded
must-gather-operator.v1.1.2 Must Gather Operator 1.1.2 must-gather-operator.v1.1.1 Succeeded
rhacs-operator.v3.74.2 Advanced Cluster Security for Kubernetes 3.74.2 rhacs-operator.v3.74.1 Succeeded
service-registry-operator.v2.1.4 Red Hat Integration - Service Registry Operator 2.1.4 service-registry-operator.v2.1.3 Succeeded
KLAB/tats-dev ~ $
// Current OLMConfig -- "CopiedCSVsEnabled"
KLAB/tats-dev ~ $ oc get OLMConfig cluster -o yaml
apiVersion: operators.coreos.com/v1
kind: OLMConfig
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    release.openshift.io/create-only: "true"
  creationTimestamp: "2022-08-31T20:23:05Z"
  generation: 1
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: d576083b-6ab6-4947-b035-26db5a5abc31
  resourceVersion: "1706897937"
  uid: 91272420-b04d-4d66-a6be-2a2c6f676e15
status:
  conditions:
  - lastTransitionTime: "2022-08-31T20:25:41Z"
    message: Copied CSVs are enabled and present across the cluster
    reason: CopiedCSVsEnabled
    status: "False"
    type: DisabledCopiedCSVs
// Disable copied CSV
KLAB/tats-dev ~ $ oc apply -f - <<EOF
> apiVersion: operators.coreos.com/v1
> kind: OLMConfig
> metadata:
>   name: cluster
> spec:
>   features:
>     disableCopiedCSVs: true
> EOF
Warning: resource olmconfigs/cluster is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by oc apply. oc apply should only be used on resources created declaratively by either oc create --save-config or oc apply. The missing annotation will be patched automatically.
olmconfig.operators.coreos.com/cluster configured
// New OLMConfig -- "DisabledCopiedCSVs"
KLAB/tats-dev ~ $ oc get OLMConfig cluster -o yaml
apiVersion: operators.coreos.com/v1
kind: OLMConfig
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"operators.coreos.com/v1","kind":"OLMConfig","metadata":{"annotations":{},"name":"cluster"},"spec":{"features":{"disableCopiedCSVs":true}}}
    release.openshift.io/create-only: "true"
  creationTimestamp: "2022-08-31T20:23:05Z"
  generation: 2
  name: cluster
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: d576083b-6ab6-4947-b035-26db5a5abc31
  resourceVersion: "1708553827"
  uid: 91272420-b04d-4d66-a6be-2a2c6f676e15
spec:
  features:
    disableCopiedCSVs: true
status:
  conditions:
  - lastTransitionTime: "2022-08-31T20:25:41Z"
    message: Copied CSVs are disabled and at least one copied CSV was found for an
      operator installed in AllNamespace mode
    reason: CopiedCSVsFound
    status: "False"
    type: DisabledCopiedCSVs
// After a few minutes, all copied CSVs are gone.
KLAB/tats-dev ~ $ oc get csv
No resources found in tats-dev namespace.
Existing apps are not affected by this change.
For instance, `grafana-operator` had been installed directly in the `openshift-bcgov-grafana-test` namespace before `DisabledCopiedCSVs` was applied. It is still visible in the Installed Operators list after the copied CSVs were removed, because it was installed in this namespace directly.
...and apps are up and running as before.
I will keep `DisabledCopiedCSVs` on these clusters for a while and see if anyone screams.
Checked the etcd dashboards on KLAB and KLAB2. DB size and memory usage have decreased on both clusters since copied CSVs were disabled.
Since these are relatively small clusters, the difference appears small. When applied to larger clusters such as Silver, we may see much larger differences in DB size and memory usage, as well as in other metrics.
FYI - These are the graphs for the CLAB and SILVER clusters over the same time period, for comparison. They still have copied CSVs in all namespaces.
Copied CSVs are now also disabled in CLAB, to test this on OCP 4.12.
One customer in CLAB asked about copied CSVs because they no longer have visibility into their copied CSVs. We'll wait to see what that customer says about copied CSVs being disabled in CLAB. If they are OK with it, we'll add disabling copied CSVs to the upgrade docs and roll it into that change.
Waiting for customer evaluation.
Disabling copied CSVs causes the operators to not show in the admin web ui for customers. This can make it more difficult to manage custom resources for these operators. Let's roll back the change in LAB and then close this ticket.
OK. Re-enabled (rolled back) copied CSVs in all lab clusters.
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OLMConfig
metadata:
  name: cluster
spec:
  features:
    disableCopiedCSVs: false
EOF
Copied CSVs are now back in each namespace...
CLAB/openshift-config ~ $ oc get csv -n te1690b-test
NAME DISPLAY VERSION REPLACES PHASE
amqstreams.v2.3.0-3 AMQ Streams 2.3.0-3 amqstreams.v2.3.0-2 Succeeded
custom-metrics-autoscaler.v2.8.2-174 Custom Metrics Autoscaler 2.8.2-174 custom-metrics-autoscaler.v2.7.1 Succeeded
devworkspace-operator.v0.20.0 DevWorkspace Operator 0.20.0 devworkspace-operator.v0.19.1-0.1682321189.p Succeeded
eap-operator.v2.3.10 JBoss EAP 2.3.10 eap-operator.v2.3.9 Succeeded
elasticsearch-operator.v5.6.5 OpenShift Elasticsearch Operator 5.6.5 elasticsearch-operator.v5.6.4 Succeeded
must-gather-operator.v1.1.2 Must Gather Operator 1.1.2 must-gather-operator.v1.1.1 Succeeded
openshift-gitops-operator.v1.7.4 Red Hat OpenShift GitOps 1.7.4 openshift-gitops-operator.v1.7.3 Succeeded
red-hat-camel-k-operator.v1.10.0-0.1682325781.p Red Hat Integration - Camel K 1.10.0+0.1682325781.p red-hat-camel-k-operator.v1.8.2-0.1675913507.p Succeeded
rhacs-operator.v3.74.3 Advanced Cluster Security for Kubernetes 3.74.3 rhacs-operator.v3.74.2 Succeeded
service-registry-operator.v2.1.5 Red Hat Integration - Service Registry Operator 2.1.5 service-registry-operator.v2.1.4 Succeeded
web-terminal.v1.7.0-0.1682321121.p Web Terminal 1.7.0+0.1682321121.p web-terminal.v1.6.0 Succeeded
I will close this ticket.
FYI - It looks like on OCP 4.13, copied CSVs will still be shown in the user's web console even when they are disabled.
When copied CSVs are disabled by a cluster administrator, the web console is modified to show copied CSVs from the openshift namespace in every namespace for regular users, even though the CSVs are not actually copied to every namespace. This allows regular users to still be able to view the details of these Operators in their namespaces and create custom resources (CRs) brought in by globally installed Operators.
Reopened this issue. We will try this again after the OCP 4.13 upgrade.
This has been disabled in Silver due to performance issues. We will re-attempt it in all clusters with the 4.13 upgrade in January.
Describe the issue: https://docs.openshift.com/container-platform/4.10/release_notes/ocp-4-10-release-notes.html#ocp-4-10-copied-csvs
Discuss this with Matt and investigate if we should implement this in our clusters.
What is the Value/Impact? Improved performance of the API/etcd in large clusters like Silver.
What is the plan? How will this get completed? Read the docs, discuss implementation with Matt, test in LAB, document how to implement on PROD clusters during the upgrade.
Identify any dependencies: OCP 4.10 upgrade in CLAB.
Definition of done: Copied CSVs disabled in lab and steps documented, or comment here why we won't use this feature.