Trino pods go down instantly when autoscaling terminates them, even if `terminationGracePeriodSeconds` is set to 300 seconds

trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Apache License 2.0

Trino pods go down instantly when autoscaling terminates them, even if `terminationGracePeriodSeconds` is set to 300 seconds #22483

Open hsushmitha opened 4 days ago

hsushmitha commented 4 days ago

We have set `terminationGracePeriodSeconds` to 300s on the Trino coordinator and worker nodes. During autoscaling, when the number of worker pods increases and decreases, pods terminate instantly without waiting for the queries running on them to finish. We have also set `shutdown.grace-period=300s` on the Trino coordinator and workers. The expectation is that a Trino worker pod waits up to 300 seconds, until the tasks on that worker complete, instead of terminating instantly.

For comparison, in Starburst we have set `starburstWorkerShutdownGracePeriodSeconds: 300`, which corresponds to `shutdown.grace-period=300s`, and `deploymentTerminationGracePeriodSeconds: 300`, which corresponds to `terminationGracePeriodSeconds`. There, the worker pods terminate after waiting 300 seconds for query tasks to run to completion, as expected.

nineinchnick commented 4 days ago

Is this about the Trino Helm chart? If yes, can you include the values to reproduce this?

hsushmitha commented 4 days ago

It is about the Trino Helm chart. Attaching the deployment config and values file for reproducing the issue.

values.txt deployment-coordinator.txt deployment-worker.txt

nineinchnick commented 4 days ago

Which chart version are you using? How do you apply the changes you included in the deployment-*.txt files?

In the latest chart version, you have to set `coordinator.terminationGracePeriodSeconds` and `worker.terminationGracePeriodSeconds`. See https://trinodb.github.io/charts/charts/trino/
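
For illustration only, a minimal sketch of how those two settings might look in a `values.yaml` file. The nesting is inferred from the dotted key names above, and the value of 300 is taken from the original report; check both against the chart documentation for the version actually deployed.

```yaml
# Sketch only: assumes a recent Trino chart where these keys exist
# under the coordinator and worker sections.
coordinator:
  terminationGracePeriodSeconds: 300  # seconds Kubernetes waits before killing the pod

worker:
  terminationGracePeriodSeconds: 300  # value taken from the report above
```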

hsushmitha commented 4 days ago

We are using Helm chart version trino-0.8.0. We run `helm upgrade trino . -f values.yaml -n trino` to deploy the changes. The files attached above are YAML files; since we couldn't attach YAML files, we attached them as .txt files.

nineinchnick commented 4 days ago

That's very old. I don't know how the chart was structured back then, so I can't help any further with that version. Can you try using the latest version?