NREL / openstudio-server-helm

Helm charts for Kubernetes deployment of OpenStudio Server.

helm delete doesn't fully terminate on AWS #22

Closed craigers290 closed 2 years ago

craigers290 commented 3 years ago

Running helm delete openstudio-server --no-hooks leaves the "rserve" and "web" pods stuck in a status of "Terminating". As a result, the Kubernetes cluster has to be deleted and a new one launched before OS Server can be redeployed.

Expected Behavior: execute helm delete then helm install without needing to delete the Kubernetes cluster.


tijcolem commented 3 years ago

Thanks for reporting this, @craigers290. This is almost certainly because the rserve, web and web-background pods use an NFS client to mount the NFS drive, which is no longer available once the NFS server pod is deleted. The shutdown is order dependent: the clients need to disconnect and unmount the drives first, and only then can the NFS server go down. Otherwise, under the current configs, the client just sits there waiting for the NFS server to come back.

The hook that the chart adds handles the graceful shutdown of these services: it deletes the pods in the proper order and then terminates the NFS server. I don't think you want to skip the hooks by adding --no-hooks unless you have a customized workflow.
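For anyone who does run a customized workflow without the hooks, the ordering described above could be sketched roughly as follows. This is not the chart's actual hook: the resource kind (deployment) and the names web, web-background, rserve and nfs-server are assumptions for illustration, so check them against your own release before using anything like it.

```shell
#!/bin/sh
# Hypothetical sketch of an ordered teardown: NFS clients first, NFS server last.
# The deployment names below are assumptions, not taken from the chart.

KUBECTL="${KUBECTL:-kubectl}"   # set KUBECTL=echo to print the commands instead of running them

ordered_teardown() {
    namespace="$1"

    # 1. Remove every workload that mounts the NFS export.
    #    --cascade=foreground (kubectl 1.20+) keeps the deployment around until its
    #    pods are gone, and --wait=true blocks until the deployment itself is gone.
    for client in web web-background rserve; do
        "$KUBECTL" delete deployment "$client" --namespace "$namespace" \
            --cascade=foreground --wait=true || return 1
    done

    # 2. Only now is it safe to take the NFS server down.
    "$KUBECTL" delete deployment nfs-server --namespace "$namespace" \
        --cascade=foreground --wait=true
}
```

Calling ordered_teardown default would run the four deletions in sequence against the default namespace; with KUBECTL=echo it only prints them, which is a cheap way to review the order first.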

The NFS client does have some options that make it eventually give up trying to reconnect to the NFS server, but when I tried them they did not work and the mount was never released, so I implemented the hooks instead.
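For reference, the kind of client-side options meant here are presumably the standard nfs(5) ones: soft, timeo and retrans. The thread does not say which options were actually tried, so the sketch below is only an illustration; the server name, export path, mount point and default values are all placeholders. It only prints the mount command so it can be inspected.

```shell
#!/bin/sh
# Hypothetical sketch: build a mount command that uses "soft" NFS semantics.
# Server, export path and mount point are placeholders, not values from the chart.

nfs_soft_mount_cmd() {
    server="$1"
    export_path="$2"
    mount_point="$3"

    # soft    : return an I/O error after the retries are exhausted
    #           (the default, "hard", retries indefinitely)
    # timeo   : time to wait before retrying a request, in tenths of a second
    # retrans : number of retries before the client gives up
    printf 'mount -t nfs -o soft,timeo=%s,retrans=%s %s:%s %s\n' \
        "${NFS_TIMEO:-100}" "${NFS_RETRANS:-3}" \
        "$server" "$export_path" "$mount_point"
}
```

As noted above, options like these did not release the mount in this setup, so treat them as a possible complement to the hooks rather than a replacement.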

tijcolem commented 2 years ago

Closing this, as the intended behavior is to use the hooks, which gracefully terminate the pods that mount the NFS volume.
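In practice that means the redeploy cycle from the original report would look something like the sketch below, with --no-hooks left out. It assumes Helm 3 syntax; the chart path passed to install is a placeholder, since the thread does not give one.

```shell
#!/bin/sh
# Hypothetical sketch: delete and reinstall a release while letting the hooks run.
# Assumes Helm 3 (helm install NAME CHART); the chart path is supplied by the caller.

HELM="${HELM:-helm}"   # set HELM=echo to print the commands instead of running them

redeploy() {
    release="$1"
    chart="$2"

    # No --no-hooks here, so the hooks can shut the NFS clients down
    # before the NFS server goes away.
    "$HELM" delete "$release" || return 1

    # Reinstall into the same cluster once the delete has finished.
    "$HELM" install "$release" "$chart"
}
```

For example, redeploy openstudio-server ./openstudio-server, where the second argument is whatever chart path or reference you normally install from.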