trinodb / charts

Apache License 2.0
151 stars 173 forks source link

feature request: choosing statefulSet instead of deployment #191

Closed Viktor3434 closed 4 months ago

Viktor3434 commented 4 months ago

Hello! Please add the ability to choose statefulset instead of deployment. if you can :)

UPD: for workers I need VolumeClaimTemplates

Viktor3434 commented 4 months ago

This is needed to store cache and spills when deploying the chart to the cloud.

It should be easy to add a switch from deployment to statefulset. You can take all the configuration from the deployment, and add the switch (in valuess) to the deployment section

in templates it is 2 if conditions 1 for kind 2 for volumeClaimTemplates 3 for volumeMounts

or add a separate template with a small modification.

nineinchnick commented 4 months ago

Isn't cache and spills ephemeral? Why would you need to bind it to specific pods?

Are you interested in contributing this?

Viktor3434 commented 4 months ago

This is what I saw in the documentation

https://trino.io/docs/current/admin/properties-spilling.html#spiller-spill-path

It is not recommended to spill to system drives. Most importantly, do not spill to the drive on which the JVM logs are written, as disk overutilization might cause JVM to pause for lengthy periods, causing queries to fail.

https://trino.io/docs/current/object-storage/file-system-cache.html#recommendations

In all cases, avoid using the root partition and disk of the node. Instead attach one or more dedicated storage devices for the cache on each node. Storage should be local, dedicated on each node, and not shared.

Also

https://github.com/trinodb/trino/discussions/22467

I can provide a template, but I don't want to sign anything. For example, I can add it here in the comments.

nineinchnick commented 4 months ago

We can't use any code you write if you don't sign the CLA.

The docs say to use a separate disk device to avoid saturating the IO, it doesn't mention anything about reusing the same spilled or cached data after a node restarts.

You should still be able to use automatically provisioned volumes with deployments, and make sure you have some policy to clean up unused ones.

mosabua commented 4 months ago

I am not sure about the motivation for this change. It works perfectly fine to use spilling (not recommended but nevertheless) or file system caching as it stands now. What would the change to statefulSet actually fix or improve.

Viktor3434 commented 4 months ago

Hi! We are assuming cache + spills to be 200+ GB. Kubernetes node volume is simply not enough in a normal configuration.

I will prepare a PR if you are willing to consider it.

I also apologize for my English :)

Viktor3434 commented 4 months ago

what do you think?

nineinchnick commented 4 months ago

You don't need StatefulSets for generic ephemeral volumes: https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/#generic-ephemeral-volumes

Viktor3434 commented 4 months ago

Yeah, thanks. I didn't know about that feature.