helm / charts

⚠️(OBSOLETE) Curated applications for Kubernetes
Apache License 2.0
15.5k stars 16.83k forks source link

[stable/minecraft] Should be a statefulset, not a deployment #21230

Closed TJM closed 4 years ago

TJM commented 4 years ago

As per issue #1863 the storage is RWO, which means that a deployment is not good, it should be a statefulset.

Version of Helm and Kubernetes:

Which chart: stable/minecraft

What happened: upgrades (changing config) fails due to deployment trying to mount the storage twice.

What you expected to happen: It should stop the old pod, then start the new one. Apparently thats a statefulset?

How to reproduce it (as minimally and precisely as possible): helm upgrade --install mc stable/minecraft --set eula=true helm upgrade --install mc stable/minecraft -f values.yaml #(with other settings)

Anything else we need to know: See #1863 for details

CC: @itzg

itzg commented 4 years ago

I'll have to defer to @gtaylor or @billimek for their advice, but it sounds like you have encountered a legitimate issue @TJM .

FWIW, I have also found a StatefulSet (manually via kubectl apply) to be a nice way to manage my kube deploys of minecraft:

https://gist.github.com/itzg/c7aeaeadc07585df1a814405f6774988

however, a Deployment with separate PVC has worked fine for me too. I think it is Helm getting in way more than anything with being too aggressive with swapping out resources rather than patching the 1-replica ReplicaSet/Deployment/StatefulSet pod template.

billimek commented 4 years ago

There is a lot of history and controversy about Deployment vs StatefulSet for helm charts and some strong opinions on both sides.

To work around this specific issue with multi-attached storage, we can solve it by changing the update strategy to Recreate instead of using the default of RollingUpdate. See the unifi chart as an example about how this is done.

I'm happy to work-up a quick PR to make this change. It should alleviate the immediate problem with upgrades and we can continue to talk though Deployment vs Statefulset.

What do you think?

itzg commented 4 years ago

That works for me @billimek . Yeah, I skimmed that issue about the Deployment vs StatefulSet and IMHO the choice is fairly arbitrary since both strategies in the general kubernetes-sence equally support persistent volumes.

billimek commented 4 years ago

21272 raised for this.

TJM commented 4 years ago

I think changing the strategy to Recreate will help, and will effectively work the same. The main thing is we don't want two minecraft's running using the same PVC, which shouldn't be possible, but...

minecraft-minecraft-576c675b66-bsgk5:minecraft-minecraft [23:48:02] [Server thread/ERROR]: Couldn't save chunk; already in use by another instance of Minecraft?
minecraft-minecraft-576c675b66-bsgk5:minecraft-minecraft bjy: The save is being accessed from another location, aborting

I think it is Helm getting in way more than anything with being too aggressive with swapping out resources rather than patching the 1-replica ReplicaSet/Deployment/StatefulSet pod template.

I was also thinking that this may be the case, as I was trying to simply update an annotation (trying to use minecraft as a demo for velero backup). When the deployment is "patched" it seems to start new pods

TJM commented 4 years ago

Yeah, I skimmed that issue about the Deployment vs StatefulSet and IMHO the choice is fairly arbitrary since both strategies in the general kubernetes-sence equally support persistent volumes.

@itzg I think PVCs for deployments would usually be RWX (ReadWriteMany), like NFS. However, as per my error above, that would be bad for MC. As noted tho, using "Recreate" should make it safe with RWO (ReadWriteOnce) or block storage.

billimek commented 4 years ago

I think we can all agree that workloads like minecraft or similar things should never leverage shared storage or multiple replicas accessing the same storage.

Assuming this is a truth, it begs the question why not run as a statefulset which will, using defaults, protect against the above situation.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

TJM commented 4 years ago

I though this was resolved, or are we still on the fence as to whether to change from deployment/replicaset to statefulset?

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue is being automatically closed due to inactivity.