Bug Report
What did you do?
Upgrade an OLM-installed operator that serves conversion webhooks. To reproduce:
Create a DevWorkspace CR to trigger webhooks: oc apply -f https://github.com/devfile/devworkspace-operator/raw/main/samples/plain.yaml
Open a terminal and begin calling conversion webhooks in a loop: while true; do oc get devworkspaces.v1alpha1.workspace.devfile.io; sleep 0.5; done
Approve the update to the Operator in the OpenShift console (version v0.16.0-0.1666668361.p)
What did you expect to see?
Webhooks continue being served as the operator deployment is rolled out to a new version.
What did you see instead? Under which circumstances?
Brief periods where conversion webhooks are unavailable during the upgrade. While the upgrade is in progress, the oc get command from the reproducer logs
Error from server (InternalError): Internal error occurred: error resolving resource
and
Error from server: conversion webhook for workspace.devfile.io/v1alpha2, Kind=DevWorkspace failed: Post "https://devworkspace-controller-manager-service.openshift-operators.svc:443/convert?timeout=30s": x509: certificate signed by unknown authority (possibly because of "x509: ECDSA verification failure" while trying to verify candidate authority certificate "Red Hat, Inc.")
Operator CSV gets condition
Conversion webhooks breaking causes the cluster to register as unstable, which is a potential issue for monitoring.
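To put a number on the brief outages described above, the reproducer loop could be wrapped so that each failed call is timestamped and counted. This is only a sketch: poll_webhook is a hypothetical helper name, not part of oc or the operator, and the attempt count and interval are arbitrary.

```shell
#!/usr/bin/env bash
# poll_webhook ATTEMPTS INTERVAL COMMAND...
# Hypothetical helper: runs COMMAND the given number of times, logs each
# failure with a UTC timestamp and the command's output, then prints a summary.
poll_webhook() {
  local attempts="$1" interval="$2"
  shift 2
  local failures=0 i output
  for ((i = 1; i <= attempts; i++)); do
    # Capture stdout and stderr so the server error message is kept.
    if ! output="$("$@" 2>&1)"; then
      failures=$((failures + 1))
      printf '%s attempt %d failed: %s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$i" "$output"
    fi
    sleep "$interval"
  done
  printf 'failures=%d attempts=%d\n' "$failures" "$attempts"
}
```

For example, running `poll_webhook 600 0.5 oc get devworkspaces.v1alpha1.workspace.devfile.io` while approving the update would poll for roughly five minutes and report how many calls failed and when.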
Environment
operator-lifecycle-manager version: OpenShift nightly -- 4.11.0-0.nightly-2022-11-08-222031
Kubernetes version information:
Kubernetes cluster kind: OpenShift
Possible Solution
Potentially an issue around certificates attached to conversion webhooks as CRDs are updated?
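One way to test the certificate hypothesis might be to log when the CA bundle on the CRD's conversion webhook changes and compare those times against the x509 errors. The sketch below assumes the CRD is named devworkspaces.workspace.devfile.io (inferred from the resource name in the reproducer) and that the CA is stored in the standard spec.conversion.webhook.clientConfig.caBundle field. ca_fingerprint and watch_ca_bundle are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Print the SHA-256 of base64-encoded data read from stdin (decoded first).
ca_fingerprint() {
  base64 -d | sha256sum | cut -d' ' -f1
}

# watch_ca_bundle CRD [INTERVAL]
# Hypothetical helper: polls the CRD and logs a timestamped line each time the
# conversion webhook caBundle fingerprint changes. A failed or empty lookup
# hashes as empty input, so it also shows up as a change. Runs until interrupted.
watch_ca_bundle() {
  local crd="$1" interval="${2:-0.5}" last="" current
  while true; do
    current="$(oc get crd "$crd" \
      -o jsonpath='{.spec.conversion.webhook.clientConfig.caBundle}' \
      | ca_fingerprint)"
    if [ "$current" != "$last" ]; then
      printf '%s caBundle sha256=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$current"
      last="$current"
    fi
    sleep "$interval"
  done
}
```

Running `watch_ca_bundle devworkspaces.workspace.devfile.io` in a second terminal during the upgrade would show whether the bundle changes around the time the errors appear. It would not by itself show which component rewrote it.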