argoproj-labs / argocd-operator

A Kubernetes operator for managing Argo CD clusters.
https://argocd-operator.readthedocs.io
Apache License 2.0

Operator version 0.9.0: operator image not available, Github release missing #1295

Open zetamorph opened 5 months ago

zetamorph commented 5 months ago

Hi,

it seems that there is a new version, 0.9.0, of the ArgoCD operator on OperatorHub.

However, operatorhub.io is still showing 0.8.0 as the latest release. There isn't a new release on GitHub, either.

Our OpenShift cluster sees the 0.9.0 version as latest, though. When it tries to install it, the installation fails because the operator image on quay.io is missing.

The ClusterServiceVersion sets the image quay.io/argoprojlabs/argocd-operator@sha256:38d61c3acda6230525614d1e609d0955f944680ccf12c07c1c84a7e5b0b98de8.

CSV:

apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  annotations:
    operators.operatorframework.io/builder: operator-sdk-v1.10.0+git
    operators.operatorframework.io/project_layout: go.kubebuilder.io/v3
    certified: 'false'
    olm.targetNamespaces: ''
    [...]
    capabilities: Deep Insights
    olm.operatorNamespace: openshift-operators
    containerImage: >-
      quay.io/argoprojlabs/argocd-operator@sha256:38d61c3acda6230525614d1e609d0955f944680ccf12c07c1c84a7e5b0b98de8
    categories: Integration & Delivery
    description: 'Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes.'
    olm.operatorGroup: global-operators
  name: argocd-operator.v0.9.0

This image does not exist on quay.io. Maybe something went wrong during the OperatorHub release?
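A minimal sketch of how the digest-pinned reference from the CSV can be split into its parts and turned into a quay.io manifest page URL for a manual check. The helper names `parse_digest_ref` and `quay_manifest_url` are hypothetical, and the URL pattern is an assumption based on the quay.io manifest links posted in this thread, not a documented API.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    digest: str


def parse_digest_ref(ref):
    """Split 'registry/namespace/name@sha256:<hex>' into registry, repository, digest."""
    name, sep, digest = ref.strip().partition("@")
    if not sep or not digest.startswith("sha256:"):
        raise ValueError("not a digest-pinned reference: %r" % ref)
    hex_part = digest[len("sha256:"):]
    # A sha256 digest is 64 lowercase hex characters.
    if len(hex_part) != 64 or any(c not in "0123456789abcdef" for c in hex_part):
        raise ValueError("malformed sha256 digest: %r" % digest)
    registry, _, repository = name.partition("/")
    return ImageRef(registry, repository, digest)


def quay_manifest_url(ref):
    # Assumption: same URL layout as the quay.io manifest links in this thread.
    return "https://%s/repository/%s/manifest/%s" % (ref.registry, ref.repository, ref.digest)


if __name__ == "__main__":
    image = (
        "quay.io/argoprojlabs/argocd-operator@sha256:"
        "38d61c3acda6230525614d1e609d0955f944680ccf12c07c1c84a7e5b0b98de8"
    )
    print(quay_manifest_url(parse_digest_ref(image)))
```

Opening the printed URL in a browser shows whether quay.io still has a manifest for that digest.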

Screenshots

Version 0.9.0 being available via OperatorHub from the POV of our OpenShift cluster:

[Screenshot, 2024-04-03 14:40:26]

bo0ts commented 5 months ago

Can confirm. According to the tag history at https://quay.io/repository/argoprojlabs/argocd-operator?tab=history&tag=latest, v0.9.0 was overwritten:

v0.9.0 was moved to SHA256 25c71c9f0fbc (https://quay.io/repository/argoprojlabs/argocd-operator/manifest/sha256:25c71c9f0fbc5203b3ea8021cb770a98642ddf3ee3cc85faac3be642b7183491) from SHA256 38d61c3acda6

The old manifest must have been deleted: https://quay.io/repository/argoprojlabs/argocd-operator/manifest/sha256:38d61c3acda6230525614d1e609d0955f944680ccf12c07c1c84a7e5b0b98de8

peterroth commented 5 months ago

Unfortunately, we had automatic updates enabled on our OKD cluster and the update failed. The service is up and running, but 0.8.0 is stuck in "Replacing" status, while 0.9.0 cannot start and continuously logs an error message:

9 times in the last 0 minutes
install strategy failed: Deployment.apps "argocd-operator-controller-manager" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"control-plane":"argocd-operator"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

I was able to reproduce it just now, running ArgoCD on a single-node OKD environment. This might be the reason why the new version was removed from operatorhub.io.
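The error above is what Kubernetes returns when an update tries to change `spec.selector` of an existing apps/v1 Deployment, since that field is immutable. A rough sketch of a pre-upgrade comparison follows; the installed selector is taken from the error message, while the "bundled" selector value and both function names are hypothetical and only illustrate the check.

```python
def selector_diff(old, new):
    """Return {label: (old_value, new_value)} for every matchLabels key that differs."""
    keys = set(old) | set(new)
    return {k: (old.get(k), new.get(k)) for k in sorted(keys) if old.get(k) != new.get(k)}


def in_place_update_allowed(old, new):
    # The API server rejects an update that changes a Deployment's selector,
    # so an in-place upgrade only works when the selectors are identical.
    return not selector_diff(old, new)


if __name__ == "__main__":
    installed = {"control-plane": "argocd-operator"}   # from the error message above
    bundled = {"control-plane": "controller-manager"}  # hypothetical new value
    print(selector_diff(installed, bundled))
    print(in_place_update_allowed(installed, bundled))
```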

Elyytscha commented 5 months ago

Had the same problem: my OKD auto-upgraded to 0.9.0 and I had a 'jiatan75' moment where I wondered what was going on. On OperatorHub it's still 0.8.0, and normally OperatorHub updates faster than the Red Hat community operator catalog, so I investigated.

This commit should have fixed it: https://github.com/argoproj-labs/argocd-operator/pull/1296/files#diff-01184a8cb2250baf03253df4c205a04a94228e5db47e1cbfd7e9edf44182a65a

but https://github.com/redhat-openshift-ecosystem/community-operators-prod/pull/4321#issuecomment-2035333231 rejected it because of:

The PR modifies existing bundles: ['argocd-operator/0.9.0'] (https://gist.github.com/rh-operator-bundle-bot/86fa0757b79b22cffbfd9003c7b97bc8)

So I think what needs to be done is to release a 0.9.1 CSV here in the argocd-operator repo, then update the existing PR or create a new one with 0.9.1 in the community-operators-prod repo.
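To illustrate the kind of check that produced the "modifies existing bundles" message, here is a small sketch. It is not the pipeline's actual implementation. The `operators/<package>/<version>/...` path layout and the function name are assumptions made for this example.

```python
def modified_existing_bundles(changed_paths, published):
    """Return the already-published 'package/version' bundles touched by a change set."""
    touched = set()
    for path in changed_paths:
        parts = path.split("/")
        # Assumed layout: operators/<package>/<version>/<files...>
        if len(parts) >= 4 and parts[0] == "operators":
            bundle = parts[1] + "/" + parts[2]
            if bundle in published:
                touched.add(bundle)
    return sorted(touched)


if __name__ == "__main__":
    published = {"argocd-operator/0.8.0", "argocd-operator/0.9.0"}
    fix_in_place = ["operators/argocd-operator/0.9.0/manifests/csv.yaml"]
    new_release = ["operators/argocd-operator/0.9.1/manifests/csv.yaml"]
    print(modified_existing_bundles(fix_in_place, published))
    print(modified_existing_bundles(new_release, published))
```

Under these assumptions, only a change that adds a new version directory such as 0.9.1 leaves the published bundles untouched, which is why a new patch release is the way forward.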

bo0ts commented 5 months ago

@reginapizza Any chance for an update soon?

reginapizza commented 5 months ago

Hello! Yep, I'm currently trying to fix this, but I wanted to verify everything with my teammate first. Unfortunately he's on PTO until Wednesday. Our release document didn't explicitly state that the community operators release PR should be merged before the RH operators PR, so I opened both PRs at the same time. The community-operators merge checks caught that the labels are wrong (due to an automatic change to some control-plane labels that is made every time we run make bundle), but the RH operators PR had already auto-merged. We don't usually do z-stream releases (as 0.9.1 would be), although we have already been considering starting to, in order to keep up with Argo CD patch updates. Once my teammate is back on Wednesday I'll work with him to sort out this whole situation, but if there is a more dire need to have it done beforehand, please let me know. Sorry for all the confusion!

Elyytscha commented 5 months ago

> although we have already been considering starting to in order to keep up with argo cd patch updates.

Just to note, I ran the entire release process yesterday on my private OKD (https://argocd-operator.readthedocs.io/en/latest/release-process/), and it works perfectly fine with a patch-level versioned CSV and image.

For now, only the alpha channel of the operator is published.

It might be a good idea to create new channel(s) where patch-level versioned CSVs are possible; that would also make this obsolete. As far as I tested, this change is only needed for successfully upgrading the operator; a new install works without the change to the control-plane label.

Maybe like this:

# https://github.com/argoproj-labs/argocd-operator/blob/c7c457dea032941b3c13d86db5e8045d62d8c6a3/deploy/olm-catalog/argocd-operator/argocd-operator.package.yaml
channels:
- currentCSV: argocd-operator.v0.7.0
  name: alpha
- currentCSV: argocd-operator.v0.9.1
  name: beta
defaultChannel: alpha
packageName: argocd-operator

(IMO the best channel schema in its final state would be something like what OpenShift does.)

channels:
- currentCSV: argocd-operator.v0.9.1
  name: stable
- currentCSV: argocd-operator.v0.9.4
  name: fast
- currentCSV: argocd-operator.v0.10.5
  name: candidate

defaultChannel: stable
packageName: argocd-operator
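A channel layout like the ones above can be sanity-checked before publishing. The following is a minimal sketch that works on an already-parsed package manifest (a plain dict). The function name and the specific rules (unique channel names, `defaultChannel` must be listed, `currentCSV` must start with `<packageName>.v`) are assumptions chosen for illustration, not the rules OLM itself enforces.

```python
def validate_package(pkg):
    """Return a list of human-readable problems found in a package manifest dict."""
    problems = []
    channels = pkg.get("channels", [])
    names = [c.get("name") for c in channels]
    if len(names) != len(set(names)):
        problems.append("duplicate channel names")
    default = pkg.get("defaultChannel")
    if default not in names:
        problems.append("defaultChannel %r is not a listed channel" % default)
    prefix = str(pkg.get("packageName", "")) + ".v"
    for c in channels:
        if not str(c.get("currentCSV", "")).startswith(prefix):
            problems.append("channel %r has unexpected currentCSV" % c.get("name"))
    return problems


if __name__ == "__main__":
    pkg = {
        "channels": [
            {"currentCSV": "argocd-operator.v0.7.0", "name": "alpha"},
            {"currentCSV": "argocd-operator.v0.9.1", "name": "beta"},
        ],
        "defaultChannel": "alpha",
        "packageName": "argocd-operator",
    }
    print(validate_package(pkg))
```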

> but if there is a more dire need to have it done before hand please let me know, and sorry for all the confusion!

No problem, I was just curious, so I investigated. I pinned my operator to 0.8.0 and then upgraded to my self-built 0.9.2, so for me there is no dire need to have it done beforehand :)
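The comment does not say how the pin was done. One common way with OLM is a Subscription that sets `startingCSV` and `installPlanApproval: Manual`, so that install plans, including upgrades, wait for manual approval. The sketch below builds such a manifest as JSON. The function name is hypothetical, and the default catalog source `community-operators`, its namespace, and the subscription name are assumptions that may differ on a given cluster.

```python
import json


def pinned_subscription(starting_csv,
                        channel="alpha",
                        namespace="openshift-operators",
                        source="community-operators",        # assumed catalog name
                        source_namespace="openshift-marketplace"):
    """Build an OLM Subscription dict that starts at a fixed CSV with manual approval."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": "argocd-operator", "namespace": namespace},
        "spec": {
            "name": "argocd-operator",
            "channel": channel,
            "source": source,
            "sourceNamespace": source_namespace,
            "installPlanApproval": "Manual",
            "startingCSV": starting_csv,
        },
    }


if __name__ == "__main__":
    print(json.dumps(pinned_subscription("argocd-operator.v0.8.0"), indent=2))
```

The printed JSON could be saved to a file and applied with `oc apply -f` or `kubectl apply -f`.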

stanbog commented 4 months ago

Hello,

Do we have an update on this?

Thank you.

reginapizza commented 4 months ago

Hey! I released 0.9.0 and also 0.9.1 yesterday. v0.9.1 is the same as 0.9.0, except that I've fixed the control-plane issue and bumped the Argo CD version to 2.9.11. I have a PR open to add it to the Red Hat community operators, but an issue with preflight in the CI checks is blocking the merge. I'm currently trying to get an ETA on when that will be fixed so the PR can auto-merge; hopefully it will be available soon!

bo0ts commented 4 months ago

@reginapizza Looks like the PR is approved but not merged yet?

reginapizza commented 4 months ago

@bo0ts hey! Yeah, unfortunately the issue with preflight is not fixed yet. If you look at the other open PRs, the majority of them are failing the same way. I've been keeping up on Slack with the people working on the fix and waiting for updates on when it should be done, and in the meantime I keep retrying the pipelines in case there's some chance that it could pass :/ I'm sorry that it's not fixed yet. I'm trying everything I can to get it merged ASAP, and I'll post here as soon as I see it's merged!

GeroL commented 4 months ago

The image v0.9.1 was pushed successfully:

[build-bundle : build] Successfully tagged quay.io/community-operator-pipeline-prod/argocd-operator:0.9.1-5cff466ceb3d7977e87f0cae0c751f219c6f4b40
[build-bundle : build] 794fdd3b2e06703c3d02727c008940ad936d0751fc8cf19d8c893788e22433c0
[build-bundle : build] Pushing quay.io/community-operator-pipeline-prod/argocd-operator:0.9.1-5cff466ceb3d7977e87f0cae0c751f219c6f4b40
[build-bundle : build] Getting image source signatures
[build-bundle : build] Copying blob sha256:5c3fd3a19596468dda16f64b15bd4650f4938489aa6c63d5dcf51aa888d03e27
[build-bundle : build] Copying config sha256:794fdd3b2e06703c3d02727c008940ad936d0751fc8cf19d8c893788e22433c0
[build-bundle : build] Writing manifest to image destination
[build-bundle : build] Storing signatures
[build-bundle : build] sha256:8848f3189af43a46c2167488f485e2f589585ce8ca0364379d116a5d73caf9ba

reginapizza commented 4 months ago

yep! Just saw that the PR was finally merged, so 0.9.1 should be available now. If you find it acceptable, please mark the issue as closed!

stanbog commented 4 months ago

Hello,

Can confirm that the issue is fixed for us as well.

Thank you for all the efforts!

jonasbartho commented 4 months ago

Hi @reginapizza

Hmm, in my OKD4 staging environment our ArgoCD operator is not able to see any versions newer than v0.8.0.

We noticed in the ArgoCD jobs under namespace openshift-marketplace that it wants to pull from quay.io/openshift-community-operators/argocd-operator@sha256:0...... while it should actually pull from quay.io/community-operator-pipeline-prod/argocd-operator@sha256:6 ....

Any tips here? I have tried reinstalling the operator (first removing the sub, CSV and relevant job/cm under namespace openshift-marketplace) without any luck :)

jonasbartho commented 4 months ago

This was due to a bug in OKD4. Pruning all the community-operator images with crictl resolved the issue.
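A rough sketch of how the stale images could be picked out on a node before removing them. It assumes the output of `crictl images -o json` has already been parsed into a dict with an `images` list whose entries carry `id`, `repoTags` and `repoDigests`. That shape, the function name, and the sample data are assumptions for this example.

```python
def stale_image_ids(crictl_json, repo_prefix):
    """Return IDs of images whose tag or digest reference starts with repo_prefix."""
    ids = []
    for image in crictl_json.get("images", []):
        # repoTags / repoDigests may be missing or null for some images.
        refs = (image.get("repoTags") or []) + (image.get("repoDigests") or [])
        if any(ref.startswith(repo_prefix) for ref in refs):
            ids.append(image["id"])
    return ids


if __name__ == "__main__":
    # Hypothetical sample data standing in for real crictl output.
    sample = {
        "images": [
            {"id": "sha256:aaa", "repoTags": [],
             "repoDigests": ["quay.io/openshift-community-operators/argocd-operator@sha256:000"]},
            {"id": "sha256:bbb", "repoTags": ["quay.io/other/image:latest"], "repoDigests": None},
        ]
    }
    print(stale_image_ids(sample, "quay.io/openshift-community-operators/"))
```

Each returned ID could then be passed to `crictl rmi` on the affected node.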

jwklijnsma commented 1 month ago

Is there any way to fix this by reinstalling the operator, and so on?