Closed guptamridul1809 closed 4 months ago
Investigating. I see similar behavior with one difference: the pods are not stuck waiting for the Cassandra nslookup to succeed, but for the schema to be populated.
Based on this line, the job that performs the schema setup will only start after everything is loaded, which can't happen because the rest of the server components depend on the schema being set up.
Simply removing that line solves everything.
@mindaugasrukas
I think those hooks are correct. According to this document: https://helm.sh/docs/topics/charts_hooks/#hooks-and-the-release-lifecycle, "loaded" doesn't mean it's blocking.
1. User runs helm install foo
2. The Helm library install API is called
3. CRDs in the crds/ directory are installed
4. After some verification, the library renders the foo templates
5. The library prepares to execute the pre-install hooks (loading hook resources into Kubernetes)
6. The library sorts hooks by weight (assigning a weight of 0 by default), by resource kind and finally by name in ascending order.
7. The library then loads the hook with the lowest weight first (negative to positive)
8. The library waits until the hook is "Ready" (except for CRDs)
9. The library loads the resulting resources into Kubernetes. Note that if the --wait flag is set, the library will wait until all resources are in a ready state and will not run the post-install hook until they are ready.
10. The library executes the post-install hook (loading hook resources)
11. The library waits until the hook is "Ready"
12. The library returns the release object (and other data) to the client
13. The client exits
What does it mean to wait until a hook is ready? This depends on the resource declared in the hook.
If the resource is a Job or Pod kind, Helm will wait until it successfully runs to completion. And if the
hook fails, the release will fail. This is a blocking operation, so the Helm client will pause while the
Job is run.
For all other kinds, as soon as Kubernetes marks the resource as loaded (added or updated),
the resource is considered "Ready".
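As a rough illustration of the quoted lifecycle, the ordering and readiness rules could be modeled as below. This is a sketch, not Helm's implementation: the `Hook` class, the function names, the weights, and the `setup-config` resource are all hypothetical, and comparing kinds alphabetically is an assumption, since the docs only say "by resource kind".

```python
from dataclasses import dataclass

# Kinds the client waits on until they run to completion, per the quoted docs.
BLOCKING_KINDS = {"Job", "Pod"}


@dataclass(frozen=True)
class Hook:
    """Hypothetical stand-in for a rendered hook resource."""
    name: str
    kind: str
    weight: int = 0  # the quoted docs give 0 as the default weight


def execution_order(hooks):
    """Sort by weight, then kind, then name, all ascending.

    Assumption: kinds are compared alphabetically here; Helm's real
    ordering of resource kinds may differ.
    """
    return sorted(hooks, key=lambda h: (h.weight, h.kind, h.name))


def blocks_until_complete(hook):
    """True if the client pauses until the hook has finished running."""
    return hook.kind in BLOCKING_KINDS


# Illustrative hooks; the weights are made up for this example.
hooks = [
    Hook("schema-update", "Job", weight=1),
    Hook("schema-setup", "Job"),
    Hook("setup-config", "ConfigMap", weight=-5),
]
ordered = execution_order(hooks)
for h in ordered:
    mode = "wait for completion" if blocks_until_complete(h) else "ready once loaded"
    print(f"{h.weight:>3} {h.kind:<10} {h.name:<14} {mode}")
```

Under this model a Job hook blocks the client until it completes, which is why a schema Job declared as a hook matters for when the remaining resources get loaded.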
But I see your point about making it a non-hook job and loading it together with all other resources.
@guptamridul1809, @chaychoong, could you try adding --set debug=true to the helm command and paste the schema-setup and schema-update Job logs? For me, they report success, but the schema is actually not loaded. I want to make sure you are having the same issue.
Also, this issue is flaky on my side. Hence, I'm unsure whether removing the helm.sh/hook annotation is related here.
Also, could you paste the DB content for:
kubectl exec service/temporal-admintools -- cqlsh temporal-cassandra 9042 -k temporal -e "SELECT * FROM schema_update_history"
kubectl exec service/temporal-admintools -- cqlsh temporal-cassandra 9042 -k temporal -e "SELECT curr_version FROM schema_version"
kubectl exec service/temporal-admintools -- cqlsh temporal-cassandra 9042 -k temporal_visibility -e "SELECT * FROM schema_update_history"
kubectl exec service/temporal-admintools -- cqlsh temporal-cassandra 9042 -k temporal_visibility -e "SELECT curr_version FROM schema_version"
Closing due to lack of feedback. Please re-open if this issue persists with the current chart version.
What are you really trying to do?
I'm trying to run Temporal using the Helm chart:
helm install --set server.replicaCount=1 --set cassandra.config.cluster_size=1 --set prometheus.enabled=false --set grafana.enabled=false --set elasticsearch.enabled=false temporaltest . --timeout 150m
Describe the bug
Several pods are stuck in the init state.
I let it run for over an hour, but there was no progress.
On further inspection, all the pods are waiting for the Cassandra nslookup to succeed.
Minimal Reproduction
Just running
helm install --set server.replicaCount=1 --set cassandra.config.cluster_size=1 --set prometheus.enabled=false --set grafana.enabled=false --set elasticsearch.enabled=false temporaltest . --timeout 150m
gives this error. I tried with the latest master and with release 1.12 to consider an older setup; both setups run into the same issue.
Environment/Versions
OS:
Docker:
Minikube:
Helm:
Additional context