hrobertson opened this issue 1 year ago
This would be closed by v0.15.0 release which is coming in a week
Thanks @shubham-cmyk. v0.15.0 should resolve the various bugs that were introduced by the v0.14.0 release, but my intent with this issue was to gain a little more confidence that future releases won't be similarly broken: not due to bugs in code, but due to the wrong commit being released.
Please could you provide some information about the release process, and how it will be ensured that future releases are taken from the correct commit, and that the documentation, OperatorHub, etc. are updated?
Thanks
We do have an auto-release mechanism. The volume-name change commit was merged into v0.14.0 by mistake; since it was a breaking change, it caused problems. We didn't put a new tag on the image after releasing v0.14.0, so that any critical bugs arising within the following week or so could be merged and fixed immediately.
We have learned from this mistake: after releasing v0.15.0 we will move to v0.15.x tags, so that critical bugs can be addressed immediately and an accidentally merged breaking change won't be a big issue.
A `stable` tag will also be added to the image from now on, to make sure users don't run into issues.
A bi-weekly release of the image with specific v0.15.x tags might be a good approach.
Sorry for the inconvenience this has caused, @hrobertson. If you'd like any other change or addition, feel free to drop a comment.
I am still not clear on what happened, because the 0.14.0 tag does not appear to have been pushed to quay until April. But that is not important if a clear release policy is in place for the future.
Since 0.15.0, it looks like you are continuously updating the v0.15.0 image tag to point to new images. This is not how a major.minor.patch image tag normally behaves. Additionally, there is no 0.15.0 tag in this git repo.
What I would expect, based on what I see as a common practice in many other repos, is as follows:
A `v0.15.0` tag is created in git. An automated CI process (GitHub Actions or something else) builds an image from that commit and pushes it to quay with the image tags `v0.15.0`, `v0.15`, and `latest`.
Then a new commit is pushed to master. No release is made yet.
Then the commit is tagged in git as `v0.15.1`. Now the CI process builds an image from that commit and pushes it to quay with the image tags `v0.15.1`, `v0.15`, and `latest`.
Note that the `v0.15.0` image tag does not get updated, nor does the `v0.15.0` git tag. They should be immutable.
Additionally, if 0.15 is not yet "ready" or is a pre-release, keep it on its own branch rather than master and don't tag it as `latest`.
Finally, a pull request should be automatically raised at https://github.com/k8s-operatorhub/community-operators for stable releases, and https://ot-redis-operator.netlify.app/docs/release-history/ should be updated automatically for all releases.
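As a rough illustration, a tag-triggered workflow along these lines could enforce the scheme above. This is a sketch only: the workflow name, registry path, Dockerfile location, and secret names are assumptions, not the project's actual CI.

```yaml
# Hypothetical GitHub Actions workflow; all names and secrets are placeholders.
name: release
on:
  push:
    tags:
      - "v*.*.*"          # e.g. v0.15.1 — run only on release tags
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Derive image tags from the git tag (v0.15.1 -> v0.15.1, v0.15)
        id: tags
        run: |
          FULL="${GITHUB_REF_NAME}"   # v0.15.1
          MINOR="${FULL%.*}"          # v0.15
          echo "full=${FULL}" >> "$GITHUB_OUTPUT"
          echo "minor=${MINOR}" >> "$GITHUB_OUTPUT"
      - uses: docker/login-action@v2
        with:
          registry: quay.io
          username: ${{ secrets.QUAY_USERNAME }}
          password: ${{ secrets.QUAY_PASSWORD }}
      - uses: docker/build-push-action@v4
        with:
          push: true
          # The patch tag is only ever written once per git tag, so it
          # stays immutable; the minor tag and latest float forward.
          tags: |
            quay.io/opstree/redis-operator:${{ steps.tags.outputs.full }}
            quay.io/opstree/redis-operator:${{ steps.tags.outputs.minor }}
            quay.io/opstree/redis-operator:latest
```

Because the workflow triggers only on tag pushes, an ordinary commit to master (the "no release is made yet" step above) produces no image.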
Thanks
Also note that fresh installations from OperatorHub appear to be non-functional due to this issue. I have `redis-operator.v0.14.0` stuck in the installing state because the `redis-operator` pod is crash-looping:
```
W0713 16:23:49.036232 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.23.0/tools/cache/reflector.go:167: failed to list *v1beta1.RedisSentinel: redissentinels.redis.redis.opstreelabs.in is forbidden: User "system:serviceaccount:openshift-operators:redis-operator" cannot list resource "redissentinels" in API group "redis.redis.opstreelabs.in" at the cluster scope
E0713 16:23:49.036291 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.0/tools/cache/reflector.go:167: Failed to watch *v1beta1.RedisSentinel: failed to list *v1beta1.RedisSentinel: redissentinels.redis.redis.opstreelabs.in is forbidden: User "system:serviceaccount:openshift-operators:redis-operator" cannot list resource "redissentinels" in API group "redis.redis.opstreelabs.in" at the cluster scope
W0713 16:23:49.266930 1 reflector.go:324] pkg/mod/k8s.io/client-go@v0.23.0/tools/cache/reflector.go:167: failed to list *v1beta1.RedisReplication: redisreplications.redis.redis.opstreelabs.in is forbidden: User "system:serviceaccount:openshift-operators:redis-operator" cannot list resource "redisreplications" in API group "redis.redis.opstreelabs.in" at the cluster scope
E0713 16:23:49.267228 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.23.0/tools/cache/reflector.go:167: Failed to watch *v1beta1.RedisReplication: failed to list *v1beta1.RedisReplication: redisreplications.redis.redis.opstreelabs.in is forbidden: User "system:serviceaccount:openshift-operators:redis-operator" cannot list resource "redisreplications" in API group "redis.redis.opstreelabs.in" at the cluster scope
```
Those appear to be new resources in v0.15, and are not granted as part of the v0.14 installation process.
I was able to get past the above errors by defining an extra set of permissions:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: redis-operator-fix
rules:
  - apiGroups:
      - redis.redis.opstreelabs.in
    resources:
      - redissentinels
      - redisreplications
    verbs:
      - get
      - watch
      - list
      - update
      - patch
      - create
      - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: redis-operator-fix
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: redis-operator-fix
subjects:
  - kind: ServiceAccount
    name: redis-operator
    namespace: openshift-operators
```
I think your cluster role was not updated. You should update it to match the latest version of the Helm chart for the specific operator you are using.
@hrobertson Your ideas seem good. I will follow them; to do so I have to change the current CI, and that will be done before any further release. The latest release we have is v0.15.0.
Hi all, someone has to update https://github.com/k8s-operatorhub/community-operators/blob/main/operators/redis-operator/0.15.0/manifests/redis-operator.v0.15.0.clusterserviceversion.yaml#L195-L199 to add `redissentinels` and `redisreplications`, otherwise a fresh install by OLM will fail indefinitely.
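For reference, the missing entries would go into the rules under the CSV's `clusterPermissions`, roughly as follows. This is a sketch of the shape only; the exact rule list and the other resources already present in that file may differ, and the verb list is assumed to mirror the operator's existing permissions.

```yaml
# Sketch: extend the existing rule for the redis.redis.opstreelabs.in
# API group in the CSV's spec.install.spec.clusterPermissions.
- apiGroups:
    - redis.redis.opstreelabs.in
  resources:
    - redissentinels     # new CRD in v0.15
    - redisreplications  # new CRD in v0.15
  verbs:
    - get
    - list
    - watch
    - create
    - update
    - patch
    - delete
```

Without these entries, OLM grants the operator's service account no access to the new CRDs, which produces exactly the "forbidden" reflector errors shown earlier in this thread.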
@iamabhishek-dubey Please check this.
Also commenting here because a fresh install via OLM fails; please update https://github.com/k8s-operatorhub/community-operators/blob/main/operators/redis-operator/0.15.0/manifests/redis-operator.v0.15.0.clusterserviceversion.yaml#L195-L199
This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!
What version of redis operator are you using?
redis-operator version: 0.14.0
What did you do?
OLM updated redis-operator to 0.14.0 (https://github.com/k8s-operatorhub/community-operators/blob/main/operators/redis-operator/0.14.0/manifests/redis-operator.v0.14.0.clusterserviceversion.yaml#L257)
What did you expect to see?
Image quay.io/opstree/redis-operator:v0.14.0 should have been built from https://github.com/OT-CONTAINER-KIT/redis-operator/tree/v0.14.0 (e86884ead1005484bdb10fb30caf8f8acac2f89b) (February 13th)
What did you see instead? In the v0.14.0 image, the manifest label `com.azure.dev.image.build.sourceversion` gives the source sha as 5e8ac25180a309ccd1f55b379af545479fedeba4 (April 14th), which is not tagged. This commit includes ff6980f6bd8c1191778cc065b1d18f11f58383a7, which broke updates and has since been reverted. This caused the issue reported here: https://github.com/OT-CONTAINER-KIT/redis-operator/issues/526#issuecomment-1597598386
Also:
What is the release process for this operator? Is none of it automated? If releases are being cut manually, what process is in place to ensure it is done correctly?
Thanks