Closed: steven-zou closed this issue 3 years ago
One day later, there are even more ReplicaSets:
replicaset.apps/sample-harbor-registryctl-54fc7f564 0 0 0 2d2h
replicaset.apps/sample-harbor-registryctl-5656d8d6f4 0 0 0 2d2h
replicaset.apps/sample-harbor-registryctl-647f499c85 0 0 0 31h
replicaset.apps/sample-harbor-registryctl-64b77c84fc 0 0 0 8h
replicaset.apps/sample-harbor-registryctl-678597bc6b 1 1 0 112m
replicaset.apps/sample-harbor-registryctl-687bc446c9 0 0 0 27h
replicaset.apps/sample-harbor-registryctl-6b8ccdcb9 1 1 0 163m
replicaset.apps/sample-harbor-registryctl-6dbc6ff85b 0 0 0 21h
replicaset.apps/sample-harbor-registryctl-6fb4fdd49 0 0 0 2d2h
replicaset.apps/sample-harbor-registryctl-745d586f66 0 0 0 21h
replicaset.apps/sample-harbor-registryctl-75d69f5d9f 0 0 0 12h
replicaset.apps/sample-harbor-registryctl-77485bd8 0 0 0 18h
replicaset.apps/sample-harbor-registryctl-775b7cdb86 0 0 0 163m
replicaset.apps/sample-harbor-registryctl-78cfc44d9 0 0 0 27h
replicaset.apps/sample-harbor-registryctl-7b9f4bcf86 0 0 0 11h
replicaset.apps/sample-harbor-registryctl-7bbddf4c58 0 0 0 21h
replicaset.apps/sample-harbor-registryctl-7bcdc65d99 0 0 0 18h
replicaset.apps/sample-harbor-registryctl-7bdb4f9b6 0 0 0 31h
replicaset.apps/sample-harbor-registryctl-7df56bc846 0 0 0 12h
replicaset.apps/sample-harbor-registryctl-9446d65 0 0 0 7h47m
replicaset.apps/sample-harbor-registryctl-997d8b5b4 0 0 0 29h
~$ kubectl rollout history deployment.apps/sample-harbor-registryctl
deployment.apps/sample-harbor-registryctl
REVISION CHANGE-CAUSE
1 <none>
2 <none>
3 <none>
4 <none>
5 <none>
6 <none>
7 <none>
8 <none>
9 <none>
10 <none>
11 <none>
12 <none>
13 <none>
14 <none>
15 <none>
16 <none>
17 <none>
18 <none>
19 <none>
20 <none>
21 <none>
22 <none>
Declare the new state of the Pods by updating the PodTemplateSpec of the Deployment. A new ReplicaSet is created and the Deployment manages moving the Pods from the old ReplicaSet to the new one at a controlled rate. Each new ReplicaSet updates the revision of the Deployment.
Get the rollout history details with the command kubectl rollout history deployment.apps/sample-harbor-registryctl --revision=xx
and compare the revisions; the main changes are in the labels and annotations related to the checksum value:
revision=1
Pod Template:
Labels:
pod-template-hash=54fc7f564
Annotations:
sample-harbor.default.registry.registryctl.goharbor.io/version: 4860044
revision=22
Pod Template:
Labels:
pod-template-hash=5fb75b98db
Annotations:
sample-harbor.default.registry.registryctl.goharbor.io/version: 5640281
@holyhope
The annotation sample-harbor.default.registry.registryctl.goharbor.io/version on the registryctl pod is very different from the annotations on the other component pods; the root cause may be the changes to this annotation. I checked the code but did not find where this annotation value is set. Could you please provide some clues?
For example, similar annotations of the core deployment:
sample-harbor-core.default.secret.core.goharbor.io/version: "4859926"
sample-harbor-core.default.configmap.core.goharbor.io/version: "4859925"
It seems the registry is a dependent resource of registryctl.
The number of registryctl ReplicaSets is still increasing:
NAME DESIRED CURRENT READY AGE
replicaset.apps/sample-harbor-core-76f967d77f 1 1 0 3d2h
replicaset.apps/sample-harbor-jobservice-6b4f89bb96 1 1 0 3d2h
replicaset.apps/sample-harbor-jobservice-79b486c54d 1 1 0 3d2h
replicaset.apps/sample-harbor-portal-bbc6c9 1 1 0 3d2h
replicaset.apps/sample-harbor-registry-688fc75c75 1 1 0 3d2h
replicaset.apps/sample-harbor-registryctl-54fc7f564 0 0 0 3d2h
replicaset.apps/sample-harbor-registryctl-5656d8d6f4 0 0 0 3d2h
replicaset.apps/sample-harbor-registryctl-5b67d698b7 1 1 0 5h35m
replicaset.apps/sample-harbor-registryctl-5fb75b98db 0 0 0 23h
replicaset.apps/sample-harbor-registryctl-5fff5cfc69 0 0 0 10h
replicaset.apps/sample-harbor-registryctl-647f499c85 0 0 0 2d7h
replicaset.apps/sample-harbor-registryctl-64b77c84fc 0 0 0 32h
replicaset.apps/sample-harbor-registryctl-64ff4599b6 0 0 0 17h
replicaset.apps/sample-harbor-registryctl-678597bc6b 0 0 0 25h
replicaset.apps/sample-harbor-registryctl-687bc446c9 0 0 0 2d3h
replicaset.apps/sample-harbor-registryctl-68dfc69756 1 1 0 6h25m
replicaset.apps/sample-harbor-registryctl-6b4965dbbc 0 0 0 21h
replicaset.apps/sample-harbor-registryctl-6b8ccdcb9 0 0 0 26h
replicaset.apps/sample-harbor-registryctl-6dbc6ff85b 0 0 0 45h
replicaset.apps/sample-harbor-registryctl-6fb4fdd49 0 0 0 3d2h
replicaset.apps/sample-harbor-registryctl-745d586f66 0 0 0 45h
replicaset.apps/sample-harbor-registryctl-75d69f5d9f 0 0 0 36h
replicaset.apps/sample-harbor-registryctl-75dd5f9b94 0 0 0 10h
replicaset.apps/sample-harbor-registryctl-76767cb449 0 0 0 10h
replicaset.apps/sample-harbor-registryctl-77485bd8 0 0 0 42h
replicaset.apps/sample-harbor-registryctl-775b7cdb86 0 0 0 26h
replicaset.apps/sample-harbor-registryctl-777cfd4b9c 0 0 0 16h
replicaset.apps/sample-harbor-registryctl-78cfc44d9 0 0 0 2d3h
replicaset.apps/sample-harbor-registryctl-7b9f4bcf86 0 0 0 35h
replicaset.apps/sample-harbor-registryctl-7bbddf4c58 0 0 0 45h
replicaset.apps/sample-harbor-registryctl-7bcdc65d99 0 0 0 42h
replicaset.apps/sample-harbor-registryctl-7bdb4f9b6 0 0 0 2d7h
replicaset.apps/sample-harbor-registryctl-7cd5b9bbd4 0 0 0 7h35m
replicaset.apps/sample-harbor-registryctl-7df56bc846 0 0 0 36h
replicaset.apps/sample-harbor-registryctl-85fd447d8b 0 0 0 14h
replicaset.apps/sample-harbor-registryctl-8666688874 0 0 0 17h
replicaset.apps/sample-harbor-registryctl-9446d65 0 0 0 31h
replicaset.apps/sample-harbor-registryctl-997d8b5b4 0 0 0 2d5h
replicaset.apps/sample-harbor-registryctl-c569ddc64 0 0 0 10h
replicaset.apps/sample-harbor-registryctl-cbb96c8bc 0 0 0 7h35m
The desired pods are not successfully created (they stay in ContainerCreating):
pod/sample-harbor-core-76f967d77f-vzlvq 0/1 ContainerCreating 0 3d2h
pod/sample-harbor-jobservice-6b4f89bb96-q7t77 0/1 ContainerCreating 0 3d2h
pod/sample-harbor-jobservice-79b486c54d-rb88q 0/1 ContainerCreating 0 3d2h
pod/sample-harbor-portal-bbc6c9-n7p5b 0/1 ContainerCreating 0 3d2h
pod/sample-harbor-registry-688fc75c75-wtrrh 0/1 ContainerCreating 0 3d2h
pod/sample-harbor-registryctl-5b67d698b7-xwc2x 0/1 ContainerCreating 0 5h35m
pod/sample-harbor-registryctl-68dfc69756-689lm 0/1 ContainerCreating 0 6h25m
I see that issue on my side too. I am working on a test suite with better scenarios and increased coverage.
Found some logs:
2020-10-30T09:54:41.400Z ERROR controller-runtime.controller Reconciler error {"controller": "registrycontroller", "request": "default/sample-harbor", "error": "cannot set status to error: cannot set conditions to error: apply apps/v1, Kind=Deployment (default/sample-harbor-registryctl): check: cannot get apps/v1, Kind=Deployment default/sample-harbor-registryctl: Deployment.apps \"sample-harbor-registryctl\" not found: apply apps/v1, Kind=Deployment (default/sample-harbor-registryctl): check: cannot get apps/v1, Kind=Deployment default/sample-harbor-registryctl: Deployment.apps \"sample-harbor-registryctl\" not found", "errorVerbose": "Deployment.apps \"sample-harbor-registryctl\" not found\ncannot get apps/v1, Kind=Deployment default/sample-harbor-registryctl\ngithub.com/goharbor/harbor-operator/pkg/controller.(Controller).ensureResourceReady\n\t/home/steven/code/harbor-operator/pkg/controller/ready.go:36\ngithub.com/goharbor/harbor-operator/pkg/controller.(Controller).applyAndCheck\n\t/home/steven/code/harbor-operator/pkg/controller/common.go:138\ngithub.com/goharbor/harbor-operator/pkg/controller.(Controller).ProcessFunc.func1\n\t/home/steven/code/harbor-operator/pkg/controller/resource.go:115\ngithub.com/goharbor/harbor-operator/pkg/graph.(resourceManager).Run.func1\n\t/home/steven/code/harbor-operator/pkg/graph/runner.go:42\ngolang.org/x/sync/errgroup.(Group).Go.func1\n\t/home/steven/code/harbor-operator/vendor/golang.org/x/sync/errgroup/errgroup.go:57\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1373\ncheck\ngithub.com/goharbor/harbor-operator/pkg/controller.(Controller).applyAndCheck\n\t/home/steven/code/harbor-operator/pkg/controller/common.go:140\ngithub.com/goharbor/harbor-operator/pkg/controller.(Controller).ProcessFunc.func1\n\t/home/steven/code/harbor-operator/pkg/controller/resource.go:115\ngithub.com/goharbor/harbor-operator/pkg/graph.(resourceManager).Run.func1\n\t/home/steven/code/harbor-operator/pkg/graph/runner.go:42\ngolang.org/x/syn
c/errgroup.(Group).Go.func1\n\t/home/steven/code/harbor-operator/vendor/golang.org/x/sync/errgroup/errgroup.go:57\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1373\napply apps/v1, Kind=Deployment (default/sample-harbor-registryctl)\ngithub.com/goharbor/harbor-operator/pkg/controller.(Controller).ProcessFunc.func1\n\t/home/steven/code/harbor-operator/pkg/controller/resource.go:117\ngithub.com/goharbor/harbor-operator/pkg/graph.(resourceManager).Run.func1\n\t/home/steven/code/harbor-operator/pkg/graph/runner.go:42\ngolang.org/x/sync/errgroup.(Group).Go.func1\n\t/home/steven/code/harbor-operator/vendor/golang.org/x/sync/errgroup/errgroup.go:57\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1373\ncannot set status to error: cannot set conditions to error: apply apps/v1, Kind=Deployment (default/sample-harbor-registryctl): check: cannot get apps/v1, Kind=Deployment default/sample-harbor-registryctl: Deployment.apps \"sample-harbor-registryctl\" not found\ngithub.com/goharbor/harbor-operator/pkg/controller.(Controller).HandleError\n\t/home/steven/code/harbor-operator/pkg/controller/errors.go:50\ngithub.com/goharbor/harbor-operator/pkg/controller.(Controller).Reconcile\n\t/home/steven/code/harbor-operator/pkg/controller/common.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).reconcileHandler\n\t/home/steven/code/harbor-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:245\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).processNextWorkItem\n\t/home/steven/code/harbor-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:221\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).worker\n\t/home/steven/code/harbor-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:200\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1 github.com/go-logr/zapr.(zapLogger).Error 
/home/steven/code/harbor-operator/vendor/github.com/go-logr/zapr/zapr.go:128 sigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).reconcileHandler /home/steven/code/harbor-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:247 sigs.k8s.io/controller-runtime/pkg/internal/controller.(Controller).processNextWorkItem /home/steven/code/harbor-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:221 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker /home/steven/code/harbor-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:200 k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1 /home/steven/code/harbor-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 k8s.io/apimachinery/pkg/util/wait.BackoffUntil /home/steven/code/harbor-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 k8s.io/apimachinery/pkg/util/wait.JitterUntil /home/steven/code/harbor-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 k8s.io/apimachinery/pkg/util/wait.Until /home/steven/code/harbor-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90
@holyhope
Do you have any clue about this issue? And do you know where the annotation sample-harbor.default.registry.registryctl.goharbor.io/version: 5640281 is set?
Each label/annotation change in the pod template causes a new ReplicaSet to be created. From the logs posted above, it seems some outdated changes are being applied to the registrycontroller component.
@glitchcrab
Any progress on this issue?
Hi Steven, I did some digging yesterday, but I'm still not sure what the root cause of this issue is. It seems related to the internal certificate that cannot be mounted. Here is what I saw in the Kubernetes events:
31s Normal Scheduled pod/sample-harbor-registryctl-656677c-rkdss Successfully assigned default/sample-harbor-registryctl-656677c-rkdss to node-67c9b53b-8e68-4f3d-976a-7499a466ca1f
32s Normal SuccessfulCreate replicaset/sample-harbor-registryctl-656677c Created pod: sample-harbor-registryctl-656677c-rkdss
30s Normal Scheduled pod/sample-harbor-registryctl-565485c695-c5l6t Successfully assigned default/sample-harbor-registryctl-565485c695-c5l6t to node-67c9b53b-8e68-4f3d-976a-7499a466ca1f
31s Normal ScalingReplicaSet deployment/sample-harbor-registryctl Scaled up replica set sample-harbor-registryctl-565485c695 to 1
31s Normal SuccessfulCreate replicaset/sample-harbor-registryctl-565485c695 Created pod: sample-harbor-registryctl-565485c695-c5l6t
32s Normal ScalingReplicaSet deployment/sample-harbor-registryctl Scaled up replica set sample-harbor-registryctl-656677c to 1
30s Warning FailedMount pod/sample-harbor-registryctl-656677c-rkdss MountVolume.SetUp failed for volume "internal-certificates" : failed to sync secret cache: timed out waiting for the condition
15s Normal Pulling pod/sample-harbor-registryctl-565485c695-c5l6t Pulling image "goharbor/harbor-registryctl:v2.0.0"
15s Normal Pulling pod/sample-harbor-registryctl-656677c-rkdss Pulling image "goharbor/harbor-registryctl:v2.0.0"
13s Normal Created pod/sample-harbor-registryctl-656677c-rkdss Created container registryctl
13s Normal Pulled pod/sample-harbor-registryctl-656677c-rkdss Successfully pulled image "goharbor/harbor-registryctl:v2.0.0" in 2.436909146s
12s Normal Pulled pod/sample-harbor-registryctl-565485c695-c5l6t Successfully pulled image "goharbor/harbor-registryctl:v2.0.0" in 3.452909536s
12s Normal Started pod/sample-harbor-registryctl-656677c-rkdss Started container registryctl
11s Normal Started pod/sample-harbor-registryctl-565485c695-c5l6t Started container registryctl
11s Normal Created pod/sample-harbor-registryctl-565485c695-c5l6t Created container registryctl
6s Normal ScalingReplicaSet deployment/sample-harbor-registryctl Scaled up replica set sample-harbor-registryctl-5b78575b9c to 1
6s Normal SuccessfulCreate replicaset/sample-harbor-registryctl-5b78575b9c Created pod: sample-harbor-registryctl-5b78575b9c-jbwpr
5s Normal Scheduled pod/sample-harbor-registryctl-5b78575b9c-jbwpr Successfully assigned default/sample-harbor-registryctl-5b78575b9c-jbwpr to node-67c9b53b-8e68-4f3d-976a-7499a466ca1f
6s Normal SuccessfulDelete replicaset/sample-harbor-registryctl-565485c695 Deleted pod: sample-harbor-registryctl-565485c695-c5l6t
6s Normal ScalingReplicaSet deployment/sample-harbor-registryctl Scaled down replica set sample-harbor-registryctl-565485c695 to 0
6s Normal Killing pod/sample-harbor-registryctl-565485c695-c5l6t Stopping container registryctl
4s Normal Pulling pod/sample-harbor-registryctl-5b78575b9c-jbwpr Pulling image "goharbor/harbor-registryctl:v2.0.0"
3s Normal Started pod/sample-harbor-registryctl-5b78575b9c-jbwpr Started container registryctl
3s Normal Created pod/sample-harbor-registryctl-5b78575b9c-jbwpr Created container registryctl
3s Normal Pulled pod/sample-harbor-registryctl-5b78575b9c-jbwpr Successfully pulled image "goharbor/harbor-registryctl:v2.0.0" in 1.272386998s
and this is the list of replicasets when I reproduced the issue:
NAME DESIRED CURRENT READY AGE
sample-harbor-registryctl-565485c695 0 0 0 3m34s
sample-harbor-registryctl-5b78575b9c 1 1 1 3m10s
sample-harbor-registryctl-656677c 0 0 0 3m35s
Hi Steven,
After more digging, I found out that the registryctl is redeployed because the registry custom resource is modified. The default.registry.checksum.goharbor.io/sample-harbor annotation changes inside the registrycontroller custom resource at the same time that the new ReplicaSet is created.
I also found that the resourceVersion is modified inside the registry custom resource, but I don't know why yet.
Hi Steven,
We identified the root cause of the issue. The operator watches the secret holding the registry's internal certificate. The secret is created empty and then populated with the certificate. From the operator's point of view the secret already exists, so it deploys the registry and the registryctl. When the certificate is then inserted into the secret, the operator detects the modification and changes the resource version of the registry, which triggers a redeploy of the registryctl because the checksum of the registry has changed.
I will modify the operator code in charge of checking whether the secret exists.
@sguyennet
Any progress on this bug?
@sguyennet @holyhope
PING! Any updates about the fix to this issue?
Hi @steven-zou, the issue with the replica sets is fixed, but we modified the way the objects are created and updated in Kubernetes, which introduced other bugs. We are currently working to solve those.
Run
make sample
Then check the resources; you'll find more than one registryctl ReplicaSet with 0 DESIRED and 0 CURRENT.
The Deployment is not ready: