litmuschaos / litmus-helm

Helm Charts for the Litmus Chaos Operator & CRDs
Apache License 2.0

Cannot upgrade from 3.5.0 to 3.7.0 #380

Open danhngo-lx opened 6 months ago

danhngo-lx commented 6 months ago

Hi, I'm trying to upgrade Litmus from 3.5.0 to 3.7.0, but the server pod goes into CrashLoopBackOff:

litmus-chaos-center-server-59ccf69588-rcfzb       0/1     CrashLoopBackOff   1 (5s ago)    17s

The logs of the server:

{"file":"/gql-server/server.go:43","func":"main.init.0","level":"info","msg":"go version: go1.20.14","time":"2024-05-20T06:47:50Z"}
{"file":"/gql-server/server.go:44","func":"main.init.0","level":"info","msg":"go os/arch: linux/amd64","time":"2024-05-20T06:47:50Z"}
{"file":"/gql-server/pkg/database/mongodb/init.go:109","func":"github.com/litmuschaos/litmus/chaoscenter/graphql/server/pkg/database/mongodb.MongoConnection","level":"info","msg":"connected to mongo","time":"2024-05-20T06:47:50Z"}
{"error":"(NamespaceExists) Collection already exists. NS: litmus.chaosInfrastructures","file":"/gql-server/pkg/database/mongodb/init.go:130","func":"github.com/litmuschaos/litmus/chaoscenter/graphql/server/pkg/database/mongodb.(*MongoClient).initAllCollection","level":"error","msg":"failed to create chaosInfrastructures collection","time":"2024-05-20T06:47:50Z"}
{"error":"(NamespaceExists) Collection already exists. NS: litmus.chaosExperiments","file":"/gql-server/pkg/database/mongodb/init.go:154","func":"github.com/litmuschaos/litmus/chaoscenter/graphql/server/pkg/database/mongodb.(*MongoClient).initAllCollection","level":"error","msg":"failed to create chaosExperiments collection","time":"2024-05-20T06:47:50Z"}
{"error":"(NamespaceExists) Collection already exists. NS: litmus.chaosExperimentRuns","file":"/gql-server/pkg/database/mongodb/init.go:178","func":"github.com/litmuschaos/litmus/chaoscenter/graphql/server/pkg/database/mongodb.(*MongoClient).initAllCollection","level":"error","msg":"failed to create chaosExperimentRuns collection","time":"2024-05-20T06:47:50Z"}
{"error":"(NamespaceExists) Collection already exists. NS: litmus.chaosHubs","file":"/gql-server/pkg/database/mongodb/init.go:196","func":"github.com/litmuschaos/litmus/chaoscenter/graphql/server/pkg/database/mongodb.(*MongoClient).initAllCollection","level":"error","msg":"failed to create chaosHubs collection","time":"2024-05-20T06:47:50Z"}
{"error":"(NamespaceExists) Collection already exists. NS: litmus.chaosProbes","file":"/gql-server/pkg/database/mongodb/init.go:275","func":"github.com/litmuschaos/litmus/chaoscenter/graphql/server/pkg/database/mongodb.(*MongoClient).initAllCollection","level":"error","msg":"failed to create chaosProbes collection","time":"2024-05-20T06:47:50Z"}
{"file":"/gql-server/server.go:103","func":"main.main","level":"fatal","msg":"control plane needs to be upgraded from version 3.5.0 to 3.7.0","time":"2024-05-20T06:47:50Z"}
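
When reading logs like these, it can help to group the JSON lines by their "level" field, so the single fatal record stands out from the "Collection already exists" errors that the server logs before it reaches the version check. The following is only a minimal sketch, assuming the logs are saved one JSON object per line; the helper name `split_by_level` and the `SAMPLE` list are illustrative and not part of Litmus. The sample records are copied from the log above.

```python
import json


def split_by_level(lines):
    """Group JSON log records by their "level" field.

    Lines that are empty or are not JSON objects are skipped.
    """
    by_level = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. kubectl banners or truncated lines
        if not isinstance(record, dict):
            continue
        by_level.setdefault(record.get("level", "unknown"), []).append(record)
    return by_level


# Three records taken from the server log in this issue.
SAMPLE = [
    r'{"file":"/gql-server/server.go:43","func":"main.init.0","level":"info","msg":"go version: go1.20.14","time":"2024-05-20T06:47:50Z"}',
    r'{"error":"(NamespaceExists) Collection already exists. NS: litmus.chaosHubs","file":"/gql-server/pkg/database/mongodb/init.go:196","func":"github.com/litmuschaos/litmus/chaoscenter/graphql/server/pkg/database/mongodb.(*MongoClient).initAllCollection","level":"error","msg":"failed to create chaosHubs collection","time":"2024-05-20T06:47:50Z"}',
    r'{"file":"/gql-server/server.go:103","func":"main.main","level":"fatal","msg":"control plane needs to be upgraded from version 3.5.0 to 3.7.0","time":"2024-05-20T06:47:50Z"}',
]

if __name__ == "__main__":
    groups = split_by_level(SAMPLE)
    # Print only the records that stop the server.
    for record in groups.get("fatal", []):
        print(record["file"], record["msg"])
```

In this log the only fatal record is the version check at server.go:103, which suggests the NamespaceExists errors are not what stops the pod.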

I suspect the error occurs because I didn't enable upgradeAgent:

upgradeAgent:
  enabled: false

But if I set it to true, the upgrade-agent-cp pod cannot start:

litmus-chaos-center-upgrade-agent-cp-2-69jxn      0/1     ImagePullBackOff   0             7m15s

This is because the upgrade-agent-cp:3.7.0 image cannot be found:

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  8m7s                  default-scheduler  Successfully assigned litmus/litmus-chaos-center-upgrade-agent-cp-2-69jxn to aks-defaultpool-31517240-vmss00001o
  Normal   Pulling    6m26s (x4 over 8m7s)  kubelet            Pulling image "litmuschaos.docker.scarf.sh/litmuschaos/upgrade-agent-cp:3.7.0"
  Warning  Failed     6m25s (x4 over 8m2s)  kubelet            Failed to pull image "litmuschaos.docker.scarf.sh/litmuschaos/upgrade-agent-cp:3.7.0": rpc error: code = NotFound desc = failed to pull and unpack image "litmuschaos.docker.scarf.sh/litmuschaos/upgrade-agent-cp:3.7.0": failed to resolve reference "litmuschaos.docker.scarf.sh/litmuschaos/upgrade-agent-cp:3.7.0": litmuschaos.docker.scarf.sh/litmuschaos/upgrade-agent-cp:3.7.0: not found
  Warning  Failed     6m25s (x4 over 8m2s)  kubelet            Error: ErrImagePull
  Warning  Failed     6m10s (x6 over 8m1s)  kubelet            Error: ImagePullBackOff
  Normal   BackOff    3m5s (x19 over 8m1s)  kubelet            Back-off pulling image "litmuschaos.docker.scarf.sh/litmuschaos/upgrade-agent-cp:3.7.0"

How can I do the upgrade in this case? Any help is appreciated. Thank you!

sambonbonne commented 6 months ago

Hello, I have the same issue when upgrading from 3.6.0 to 3.7.0. The portal server crashes with the error "control plane needs to be upgraded from version 3.6.0 to 3.7.0":

{"file":"/gql-server/server.go:43","func":"main.init.0","level":"info","msg":"go version: go1.20.14","time":"2024-05-28T08:18:58Z"}
{"file":"/gql-server/server.go:44","func":"main.init.0","level":"info","msg":"go os/arch: linux/arm64","time":"2024-05-28T08:18:58Z"}
{"file":"/gql-server/pkg/database/mongodb/init.go:109","func":"github.com/litmuschaos/litmus/chaoscenter/graphql/server/pkg/database/mongodb.MongoConnection","level":"info","msg":"connected to mongo","time":"2024-05-28T08:18:59Z"}
{"file":"/gql-server/server.go:103","func":"main.main","level":"fatal","msg":"control plane needs to be upgraded from version 3.6.0 to 3.7.0","time":"2024-05-28T08:18:59Z"}