Current Behaviour
The builder restarted while the clusters were being built.
NAME                                READY   STATUS    RESTARTS        AGE
ansibler-599cb5b7b7-25hzm           1/1     Running   0               6h14m
builder-657499cc75-zsqw2            1/1     Running   1 (5h49m ago)   6h14m
claudie-operator-7b88589ff9-lhwlf   1/1     Running   0               6h14m
As the logs below show, the builder finished building the GCP cluster in test-set-no1 at 10:10:17Z. It was supposed to start building the OCI cluster in test-set-no1 right afterwards. Instead, the first log entry for the OCI cluster appears at 11:54:20Z, almost 2 hours later.
...
2024-09-19T10:10:17Z INF ../go/services/builder/domain/usecases/config_processor_v2.go:94 > Finished processing task "1b77fa23-c7cd-48b3-ac0a-802de7ce57ff" for cluster "ts1-gcp-cluster-test-set-no1" config "claudie-9cb8ac3-2971-test-set1" module=builder
2024-09-19T10:45:07Z DBG ../go/services/builder/domain/usecases/workflow_helpers.go:164 > updating task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" with state: stage:ANSIBLER status:IN_PROGRESS module=builder
2024-09-19T10:45:07Z DBG ../go/services/builder/domain/usecases/workflow_helpers.go:164 > updating task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" with state: stage:DESTROY_TERRAFORMER status:IN_PROGRESS description:"destroying infrastructure" module=builder
2024-09-19T10:45:07Z INF ../go/services/builder/domain/usecases/terraformer_caller.go:62 > Calling DestroyInfrastructure on Terraformer cluster=hybrid-cluster-test-set-no-5-zkqpq84 module=builder project=claudie-9cb8ac3-2971-test-set5
2024-09-19T10:46:33Z INF ../go/services/builder/domain/usecases/terraformer_caller.go:66 > DestroyInfrastructure on Terraformer finished successfully cluster=hybrid-cluster-test-set-no-5-zkqpq84 module=builder project=claudie-9cb8ac3-2971-test-set5
2024-09-19T10:46:33Z DBG ../go/services/builder/domain/usecases/workflow_helpers.go:164 > updating task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" with state: stage:DESTROY_TERRAFORMER status:IN_PROGRESS module=builder
2024-09-19T10:46:33Z DBG ../go/services/builder/domain/usecases/workflow_helpers.go:164 > updating task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" with state: stage:DESTROY_KUBER status:IN_PROGRESS description:"deleting kubeconfig secret" module=builder
2024-09-19T10:46:33Z INF ../go/services/builder/domain/usecases/kuber_caller.go:137 > Calling DeleteKubeconfig on Kuber cluster=hybrid-cluster-test-set-no-5-zkqpq84 module=builder project=claudie-9cb8ac3-2971-test-set5
2024-09-19T10:46:33Z DBG ../go/services/builder/domain/usecases/workflow_helpers.go:164 > updating task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" with state: stage:DESTROY_KUBER status:IN_PROGRESS description:"deleting cluster metadata secret" module=builder
2024-09-19T10:46:33Z INF ../go/services/builder/domain/usecases/kuber_caller.go:144 > Calling DeleteClusterMetadata on kuber cluster=hybrid-cluster-test-set-no-5-zkqpq84 module=builder project=claudie-9cb8ac3-2971-test-set5
2024-09-19T10:46:33Z INF ../go/services/builder/domain/usecases/kuber_caller.go:148 > DeleteKubeconfig on Kuber finished successfully cluster=hybrid-cluster-test-set-no-5-zkqpq84 module=builder project=claudie-9cb8ac3-2971-test-set5
2024-09-19T10:46:33Z DBG ../go/services/builder/domain/usecases/workflow_helpers.go:164 > updating task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" with state: stage:KUBER status:IN_PROGRESS module=builder
2024-09-19T10:46:33Z INF ../go/services/builder/domain/usecases/config_processor_v2.go:52 > successfully processed task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" module=builder
2024-09-19T10:46:33Z DBG ../go/services/builder/domain/usecases/config_processor_v2.go:60 > updating current state for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" module=builder
2024-09-19T10:46:33Z DBG ../go/services/builder/domain/usecases/config_processor_v2.go:77 > updating task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" for config "claudie-9cb8ac3-2971-test-set5" with status: DONE module=builder
2024-09-19T10:46:33Z INF ../go/services/builder/domain/usecases/config_processor_v2.go:94 > Finished processing task "d7079d7b-83dd-4207-980c-71b33b2d2b7c" for cluster "hybrid-cluster-test-set-no-5" config "claudie-9cb8ac3-2971-test-set5" module=builder
2024-09-19T11:54:20Z DBG ../go/services/builder/domain/usecases/config_processor_v2.go:133 > [task "371c8509-6fb1-459f-b700-cc485da1a4a8"] Update operation "ts1-oci-cluster-test-set-no1" from config "claudie-9cb8ac3-2971-test-set1" module=builder
2024-09-19T11:54:20Z DBG ../go/services/builder/domain/usecases/workflow_helpers.go:164 > updating task "371c8509-6fb1-459f-b700-cc485da1a4a8" for cluster "ts1-oci-cluster-test-set-no1" for config "claudie-9cb8ac3-2971-test-set1" with state: stage:TERRAFORMER status:IN_PROGRESS description:"building infrastructure" module=builder
2024-09-19T11:54:20Z INF ../go/services/builder/domain/usecases/terraformer_caller.go:27 > Calling BuildInfrastructure on Terraformer cluster=ts1-oci-cluster-test-set-no1-050cz7r module=builder project=claudie-9cb8ac3-2971-test-set1
2024-09-19T11:55:39Z INF ../go/services/builder/domain/usecases/terraformer_caller.go:32 > BuildInfrastructure on Terraformer finished successfully cluster=ts1-oci-cluster-test-set-no1-050cz7r module=builder project=claudie-9cb8ac3-2971-test-set1
...
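For reference, the size of the gap can be computed from the two timestamps in the excerpt above (the end of the GCP task and the first entry for the OCI task). This is only a small sketch; the helper name `gap` is made up for illustration and is not part of the builder.

```go
package main

import (
	"fmt"
	"time"
)

// gap returns the time elapsed between two RFC 3339 timestamps.
// Hypothetical helper, used here only to measure the idle period.
func gap(from, to string) (time.Duration, error) {
	start, err := time.Parse(time.RFC3339, from)
	if err != nil {
		return 0, err
	}
	end, err := time.Parse(time.RFC3339, to)
	if err != nil {
		return 0, err
	}
	return end.Sub(start), nil
}

func main() {
	// Timestamps copied from the log excerpt above.
	d, err := gap("2024-09-19T10:10:17Z", "2024-09-19T11:54:20Z")
	if err != nil {
		panic(err)
	}
	fmt.Println(d) // prints 1h44m3s
}
```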
As a result, the OCI cluster in test-set-no1 was stuck in the "building infrastructure" stage.
Besides that, the e2e pipeline failed because the test sets took too long to finish.
2024-09-19T09:06:27Z ERR claudie_test.go:125 > Error in test sets test-set3 error="error while monitoring manifest 1.yaml from test set test-set3 : test took too long... Aborting after 8000 seconds" module=testing-framework
2024-09-19T09:11:19Z ERR claudie_test.go:125 > Error in test sets test-set2 error="error while monitoring manifest 1.yaml from test set test-set2 : test took too long... Aborting after 8000 seconds" module=testing-framework
2024-09-19T09:13:10Z ERR claudie_test.go:147 > Error in test sets autoscaling-1 error="error while performing additional test for manifest 1.yaml from autoscaling-1 : test took too long... Aborting after 8000 seconds" module=testing-framework
panic: test timed out after 3h0m0s
running tests:
TestClaudie (3h0m0s)
Expected Behaviour
The builder should first finish building the cluster. Only then should it restart.
Steps To Reproduce
I have no idea.
Anything else to note
Nothing.