strangelove-ventures / cosmos-operator

Cosmos Operator is a Kubernetes operator for managing Cosmos nodes
Apache License 2.0

ScheduledVolumeSnapshot says VolumeSnapshot CRDs are not installed when they are #389

Closed by gonzalomarcote 9 months ago

gonzalomarcote commented 9 months ago

When I create a ScheduledVolumeSnapshot, it reports that the snapshot CRDs are missing:

kubectl get scheduledvolumesnapshots.cosmos.strange.love 
NAME                                             PHASE         AGE
evm-sidechain-testnet-scheduledvolumesnapshots   MissingCRDs   64m

The cosmos-operator-system logs say they are not installed:

2023-11-15T15:13:44Z    error   Controller is disabled  {"controller": "scheduledvolumesnapshot", "controllerGroup": "cosmos.strange.love", "controllerKind": "ScheduledVolumeSnapshot", "ScheduledVolumeSnapshot": {"name":"evm-sidechain-testnet-scheduledvolumesnapshots","namespace":"default"}, "namespace": "default", "name": "evm-sidechain-testnet-scheduledvolumesnapshots", "reconcileID": "0d2c758f-98a2-4665-9926-918a7692da79", "error": "cluster does not have VolumeSnapshot CRDs installed"}

However, those CRDs are installed in our cluster:

kubectl api-resources | grep volumesnapshot
scheduledvolumesnapshots                              cosmos.strange.love/v1alpha1           true         ScheduledVolumeSnapshot
volumesnapshotclasses             vsclass,vsclasses   snapshot.storage.k8s.io/v1             false        VolumeSnapshotClass
volumesnapshotcontents            vsc,vscs            snapshot.storage.k8s.io/v1             false        VolumeSnapshotContent
volumesnapshots                   vs                  snapshot.storage.k8s.io/v1             true         VolumeSnapshot

kubectl get crd | grep -i snapshot
scheduledvolumesnapshots.cosmos.strange.love                2023-10-04T13:52:15Z
volumesnapshotclasses.snapshot.storage.k8s.io               2023-11-15T15:12:21Z
volumesnapshotcontents.snapshot.storage.k8s.io              2023-11-15T15:12:29Z
volumesnapshots.snapshot.storage.k8s.io                     2023-11-15T15:12:36Z

Do these CRDs need to be present in the cluster before creating or deploying the CosmosFullNode? Or what else could be causing this issue?

agouin commented 9 months ago

Can you try deleting the operator pod(s)? Deleting them will cause them to restart, and it should detect the CRDs. The CRDs are detected on operator startup.
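
For anyone landing here later, the restart can be sketched roughly as below. This is only an illustration under assumptions: the namespace and Deployment name are inferred from the log and pod names quoted in this thread and may differ in your install, `run` and `restart_operator` are hypothetical helper names, and it uses `kubectl rollout restart` rather than deleting the pods by hand, which should have the same effect of forcing a fresh operator startup.

```shell
# Assumed names, inferred from the names quoted in this thread;
# override them if your installation differs.
NAMESPACE="${NAMESPACE:-cosmos-operator-system}"
DEPLOYMENT="${DEPLOYMENT:-cosmos-operator-controller-manager}"

# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: restart the operator so CRD detection runs again.
restart_operator() {
  # Confirm the VolumeSnapshot CRD really exists before restarting.
  run kubectl get crd volumesnapshots.snapshot.storage.k8s.io || return 1
  # Recreate the operator pods; the CRDs are detected on operator startup.
  run kubectl -n "$NAMESPACE" rollout restart "deployment/$DEPLOYMENT" || return 1
  # Wait until the replacement pods are ready.
  run kubectl -n "$NAMESPACE" rollout status "deployment/$DEPLOYMENT"
}
```

Running `DRY_RUN=1 restart_operator` prints the three commands without touching the cluster. After a real run, `kubectl get scheduledvolumesnapshots.cosmos.strange.love` should no longer show the MissingCRDs phase.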

gonzalomarcote commented 9 months ago

That was exactly the issue. Restarting the cosmos-operator-controller-manager pods made them detect the recently installed snapshot CRDs.

2023-11-16T10:00:20Z    info    Requeuing for next snapshot {"controller": "scheduledvolumesnapshot", "controllerGroup": "cosmos.strange.love", "controllerKind": "ScheduledVolumeSnapshot", "ScheduledVolumeSnapshot": {"name":"evm-sidechain-testnet-scheduledvolumesnapshots","namespace":"default"}, "namespace": "default", "name": "evm-sidechain-testnet-scheduledvolumesnapshots", "reconcileID": "b9d9e937-0657-422a-942c-0e8e02e34ef8", "duration": "14m39.235053464s"}

Thanks. You can close this issue.

hakuno2000 commented 2 months ago

> Can you try deleting the operator pod(s)? Deleting them will cause them to restart, and it should detect the CRDs. The CRDs are detected on operator startup.

Thanks for the suggestion. I tried this and also recreated the ScheduledVolumeSnapshot, and it's working normally now.