project-zot / zot

zot - A scale-out production-ready vendor-neutral OCI-native container image/artifact registry (purely based on OCI Distribution Specification)
https://zotregistry.dev
Apache License 2.0

[Bug]: On environment variable update, the new pod of zot that gets created goes into crashloopbackoff #2733

Open kiransripada22 opened 5 days ago

kiransripada22 commented 5 days ago

zot version

v2.1.1

Describe the bug

Hi,

We run zot as a Kubernetes Deployment, with a Flux CD controller that tracks changes to this Deployment and applies them to our clusters.

We previously ran zot v1.4.3, which never had any issue with rolling updates when something changed in the zot Deployment.

We started seeing the issue after upgrading to zot v2.1.1.

After the upgrade, whenever we update any environment variable, the new pod goes into a crash loop instead of replacing the old running pod, and the old pod keeps running.

We have to delete the old pod manually for the crash loop to stop.

We checked the logs and found the following in the zot container:

{"level":"error","error":"timeout","goroutine":1,"caller":"zotregistry.dev/zot/pkg/cli/server/root.go:76","time":"2024-10-16T11:39:06.856027479Z","message":"failed to init controller"}

Error: timeout

To reproduce

  1. Deploy the zot image as a Kubernetes Deployment.
  2. Update any environment variable in the Deployment's pod spec.

Expected behavior

A new zot pod should be created and replace the old pod.

Screenshots

No response

Additional context

No response

rchincha commented 5 days ago

@kiransripada22

v1.4.x -> v2.x.x is a major version upgrade path and we don't guarantee backward compatibility in this case.

That said, the best approach would be to set up a v2.x.x zot, configure sync/mirror from v1.4.3, and do rolling upgrades from then on.

andaaron commented 5 days ago

@kiransripada22, do you have anything specific in the configuration which is shared between the zot instances? Maybe shared storage? Are you using zot or zot-minimal? Do you have any specific extensions enabled? Do you use authentication?

kiransripada22 commented 4 days ago

@rchincha Sorry if I was not clear, but I am facing this issue with a fresh installation of zot v2.1.1. I did a completely new installation of zot v2.1.1, and when we did a rolling update in that cluster, the new pod failed to init the controller.

@andaaron

  1. I am using zot-minimal.
  2. I have auth enabled with htpasswd.
  3. I have sync enabled with another container registry.
  4. This is a rolling update scenario, so I think the storage is the same.

Note: I also found that the update works when the Deployment uses the Recreate strategy, but our scenario needs a rolling update.
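
For reference, the two strategies compared in the note can be sketched as patch bodies for a Deployment's `spec.strategy`. This is a minimal Python sketch and was not posted in the thread: the helper name `strategy_patch` is hypothetical, and the defaults chosen here (`maxSurge: 0`, `maxUnavailable: 1`) are an assumption about what a single-replica setup might try. With those values a rolling update has to scale the old pod down before the new one is created, although the old pod may still be terminating when the new one starts.

```python
import json


def strategy_patch(kind: str, max_surge: int = 0, max_unavailable: int = 1) -> dict:
    """Build a merge-patch body that sets spec.strategy on a Deployment.

    `strategy_patch` is a hypothetical helper for illustration only.
    """
    if kind == "Recreate":
        # rollingUpdate must be cleared when the strategy type is Recreate.
        strategy = {"type": "Recreate", "rollingUpdate": None}
    elif kind == "RollingUpdate":
        strategy = {
            "type": "RollingUpdate",
            "rollingUpdate": {
                # maxSurge 0: never run an extra pod alongside the old one.
                "maxSurge": max_surge,
                # maxUnavailable 1: allow the single replica to go down first.
                "maxUnavailable": max_unavailable,
            },
        }
    else:
        raise ValueError(f"unknown strategy type: {kind}")
    return {"spec": {"strategy": strategy}}


if __name__ == "__main__":
    # Print the patch as JSON so it can be passed to a patch command.
    print(json.dumps(strategy_patch("RollingUpdate"), indent=2))
```

Assuming the script is saved as `strategy_patch.py` and the Deployment is named `zot` (both names are placeholders), the output could be applied with `kubectl patch deployment zot --type merge -p "$(python3 strategy_patch.py)"`. Either variant trades a short window of unavailability for not running two pods at once.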

rchincha commented 4 days ago

@kiransripada22

Wondering if you need this: https://github.com/project-zot/zot/pull/2730

kiransripada22 commented 2 days ago

@rchincha I think that may not fully fix it, because if we delete the existing pod, the controller has no issue initialising. So this could be a resource availability issue during the rolling update.
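
One mechanism that would match these symptoms is an exclusive lock on a file in storage shared by both pods. This is an assumption and is not confirmed anywhere in this thread. The Python sketch below only illustrates the pattern on a POSIX system: a first holder (standing in for the old pod) keeps the lock, a second opener (standing in for the new pod) gives up with a timeout, and the second opener succeeds once the first holder is gone. The file name `state.db` and the function names are hypothetical and do not come from zot.

```python
import fcntl
import os
import tempfile
import time


def acquire_exclusive(path: str, timeout: float, poll: float = 0.05):
    """Open `path` and take an exclusive advisory lock, or raise TimeoutError."""
    handle = open(path, "a+")
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Non-blocking attempt; fails if another open handle holds the lock.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return handle
        except BlockingIOError:
            if time.monotonic() >= deadline:
                handle.close()
                raise TimeoutError("timeout")
            time.sleep(poll)


def simulate_rollout() -> list:
    """Replay the rolling-update sequence against one shared file."""
    events = []
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "state.db")  # hypothetical shared file
        old_pod = acquire_exclusive(path, timeout=0.2)
        try:
            # The "new pod" starts while the "old pod" still holds the lock.
            acquire_exclusive(path, timeout=0.2)
            events.append("new pod started")
        except TimeoutError:
            events.append("new pod: timeout")
        # Deleting the old pod corresponds to closing its handle,
        # which releases the lock.
        old_pod.close()
        new_pod = acquire_exclusive(path, timeout=0.2)
        events.append("new pod started after old pod removed")
        new_pod.close()
    return events


if __name__ == "__main__":
    for event in simulate_rollout():
        print(event)
```

If something like this is happening, it would also be consistent with the Recreate strategy working, since that strategy removes the old pod before the new one starts.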