operator-framework / operator-controller

Apache License 2.0
28 stars 47 forks source link

`ClusterExtension`'s condition `HasValidBundle` is `False` when `Resolved`, `Installed` and `Unpacked` are all `True` #989

Closed m1kola closed 4 days ago

m1kola commented 4 days ago

It looks like something is wrong with ClusterExtension's HasValidBundle condition. I noticed that in this test: https://github.com/operator-framework/operator-controller/blob/cab41aaa6764f9413a58d52efe80dca9b8bdf651/test/e2e/cluster_extension_install_test.go#L82-L94

HasValidBundle reports False with the following message while other conditions such as Resolved, Installed and Unpacked are all report True:

error fetching bundles: error fetching catalog "test-catalog" contents: error performing request: Get "https://catalogd-catalogserver.olmv1-system.svc/catalogs/test-catalog/all.json": dial tcp 10.96.194.165:443: connect: connection refused

This can be reproduced by adding spew.Dump(clusterExtension.Status.Conditions) before this line:

https://github.com/operator-framework/operator-controller/blob/cab41aaa6764f9413a58d52efe80dca9b8bdf651/test/e2e/cluster_extension_install_test.go#L85

You should see something like this:

([]v1.Condition) (len=8 cap=9) {
 (v1.Condition) &Condition{Type:Resolved,Status:True,ObservedGeneration:1,LastTransitionTime:2024-06-28 11:19:20 +0200 CEST,Reason:Success,Message:resolved to "docker-registry.operator-controller-e2e.svc.cluster.local:5000/bundles/registry-v1/prometheus-operator:v2.0.0",},
 (v1.Condition) &Condition{Type:Installed,Status:True,ObservedGeneration:1,LastTransitionTime:2024-06-28 11:19:31 +0200 CEST,Reason:Success,Message:Instantiated bundle clusterextension-c75hzgnn successfully,},
 (v1.Condition) &Condition{Type:HasValidBundle,Status:False,ObservedGeneration:1,LastTransitionTime:2024-06-28 11:19:20 +0200 CEST,Reason:ResolutionFailed,Message:error fetching bundles: error fetching catalog "test-catalog" contents: error performing request: Get "https://catalogd-catalogserver.olmv1-system.svc/catalogs/test-catalog/all.json": dial tcp 10.96.194.165:443: connect: connection refused,},
 (v1.Condition) &Condition{Type:Deprecated,Status:False,ObservedGeneration:1,LastTransitionTime:2024-06-28 11:19:20 +0200 CEST,Reason:Deprecated,Message:,},
 (v1.Condition) &Condition{Type:PackageDeprecated,Status:False,ObservedGeneration:1,LastTransitionTime:2024-06-28 11:19:20 +0200 CEST,Reason:Deprecated,Message:,},
 (v1.Condition) &Condition{Type:ChannelDeprecated,Status:False,ObservedGeneration:1,LastTransitionTime:2024-06-28 11:19:20 +0200 CEST,Reason:Deprecated,Message:,},
 (v1.Condition) &Condition{Type:BundleDeprecated,Status:False,ObservedGeneration:1,LastTransitionTime:2024-06-28 11:19:20 +0200 CEST,Reason:Deprecated,Message:,},
 (v1.Condition) &Condition{Type:Unpacked,Status:True,ObservedGeneration:1,LastTransitionTime:2024-06-28 11:19:20 +0200 CEST,Reason:UnpackSuccess,Message:unpack successful: ,}
}

Alternatively, modify the test with the following patch (apply on top of main which is currently at cab41aaa6764f9413a58d52efe80dca9b8bdf651):

diff --git a/test/e2e/cluster_extension_install_test.go b/test/e2e/cluster_extension_install_test.go
index 3308c03..23d31e1 100644
--- a/test/e2e/cluster_extension_install_test.go
+++ b/test/e2e/cluster_extension_install_test.go
@@ -117,6 +117,19 @@ func TestClusterExtensionInstallRegistry(t *testing.T) {
        assert.Contains(ct, cond.Message, "Instantiated bundle")
        assert.NotEmpty(ct, clusterExtension.Status.InstalledBundle)
    }, pollDuration, pollInterval)
+
+   t.Log("By eventually reporting valid bundle")
+   require.EventuallyWithT(t, func(ct *assert.CollectT) {
+       assert.NoError(ct, c.Get(context.Background(), types.NamespacedName{Name: clusterExtension.Name}, clusterExtension))
+       cond := apimeta.FindStatusCondition(clusterExtension.Status.Conditions, ocv1alpha1.TypeHasValidBundle)
+       if !assert.NotNil(ct, cond) {
+           return
+       }
+       assert.Equal(ct, metav1.ConditionTrue, cond.Status)
+       assert.Equal(ct, ocv1alpha1.ReasonSuccess, cond.Reason)
+       assert.Empty(ct, cond.Message)
+       assert.NotEmpty(ct, clusterExtension.Status.InstalledBundle)
+   }, pollDuration, pollInterval)
 }

 func TestClusterExtensionInstallReResolvesWhenNewCatalog(t *testing.T) {
m1kola commented 4 days ago

It looks like this is related to https://github.com/operator-framework/operator-controller/pull/846#discussion_r1604734413:

2. If we do need it - as far as I see we only set this condition to False in case of an error from r.Storage.Load(ctx, ext) and never set it to True.