georgethebeatle opened 9 months ago
For context, an app restart does not cause downtime when the app is scaled to more than one instance.
When the statefulset gets updated by servicebinding.io, the statefulset controller restarts its pods one by one (similar to what we do on a rolling restart).
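For reference, a minimal sketch (hypothetical names and image) of the statefulset mechanics at play: under the default `RollingUpdate` strategy, any change to `spec.template` (such as the volume mounts injected by servicebinding.io) makes the statefulset controller replace the pods one at a time.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-app                 # hypothetical app workload name
spec:
  replicas: 3                  # with >1 replica, a rolling restart avoids downtime
  serviceName: my-app
  selector:
    matchLabels:
      app: my-app
  updateStrategy:
    type: RollingUpdate        # default: pods are replaced one by one, highest ordinal first
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: my-app:latest   # hypothetical image
```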
We discussed this issue on the last working group sync and decided that it is not too critical, because
Given all of the above, we think it would be better to document, in the section dealing with differences from CF, the fact that Korifi restarts workloads when one binds services to them.
Given that we decided to accept the fact that binding/unbinding an app to a service instance automatically restarts the app, should we also restart the app when the service instance credentials change? This way we could simply document that Korifi automatically restarts apps whenever service details change, so that the app can immediately see the change. Wouldn't such a behaviour be more consistent than auto-restarting on bind/unbind but requiring a manual restart on credentials change?
We spiked a bit on this idea and pushed the relevant changes here
The CF binding is composed of two phases:
So my proposal here is:
This way the restart is not done at binding time (unexpected) but at restage/restart time (expected).
In order to achieve the behaviour you suggest we would have to change the workload of the servicebinding.io binding. Currently, the binding workload is the statefulset. This results in servicebinding.io updating the statefulset pod template via its webhook, which immediately triggers the statefulset controller to restart the pods.
We had an idea to change the binding workload to another resource (e.g. the AppWorkload); however: a) we need to spike on whether such an approach works at all, b) we have to ensure that existing bindings still work after a Korifi upgrade, and c) if possible, we should avoid pod restarts during the upgrade.
Once/if we prove that such an approach is feasible, we could change the AppWorkload (or whatever target resource we choose) controller to update the statefulset pod template only on restart/restage. A rough sketch of both shapes of the binding follows.
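To make the idea concrete, here is a hedged sketch of what the binding could look like. The names and the `service` ref are illustrative, and the AppWorkload variant is exactly what the spike would need to validate:

```yaml
# Current shape: the binding workload is the statefulset, so the
# servicebinding.io webhook patches its pod template directly and the
# pods restart immediately.
apiVersion: servicebinding.io/v1beta1
kind: ServiceBinding
metadata:
  name: my-binding
spec:
  service:                     # illustrative ref to the credentials-providing resource
    apiVersion: korifi.cloudfoundry.org/v1alpha1
    kind: CFServiceBinding
    name: my-binding
  workload:
    apiVersion: apps/v1
    kind: StatefulSet          # current target: a template change restarts pods at once
    name: my-app
---
# Proposed (unvalidated) shape: target the AppWorkload instead, so the
# AppWorkload controller can defer propagating the injected mounts to
# the statefulset until restart/restage.
apiVersion: servicebinding.io/v1beta1
kind: ServiceBinding
metadata:
  name: my-binding
spec:
  service:
    apiVersion: korifi.cloudfoundry.org/v1alpha1
    kind: CFServiceBinding
    name: my-binding
  workload:
    apiVersion: korifi.cloudfoundry.org/v1alpha1
    kind: AppWorkload          # feasibility to be proven by the spike
    name: my-app
```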
A bonus of such an approach would be that Korifi makes no assumptions that workloads are backed by statefulsets. This assumption is currently an issue if someone wants to bring their own workload runners (for example, a workload runner that uses deployments instead of statefulsets).
Background
In CF for VMs, when a user binds a service instance to an app, the app does not immediately see the new service binding but needs to be restarted before the updated value of the `VCAP_SERVICES` env var is available to it. This is normal and expected, as the process env is immutable. The cf cli outputs a tip regarding this behaviour.
In Korifi binding a service to an app means two things:

- Updating the `VCAP_SERVICES` env var of the app (the classic CF way of consuming a service instance)
- Creating a `servicebinding.servicebinding.io` object that mounts the details of the service instance we are binding to as a mountpoint in the resulting pod where the application is running. This is a new, pure Kubernetes way of consuming the service instance and is specified in the servicebinding.io spec.

What the `servicebinding` object does is inject new entries in the `volumes` and `volumeMounts` fields of the Pod or the PodTemplateSpec, which effectively restarts the pod. This results in Korifi behaving slightly differently from classic CF by doing unexpected app restarts, which might mean downtime for the users of the app. Another inconvenience is that if the user wants to bind `n` service instances to the app, the app will be restarted `n` times.
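For illustration, a sketch (hypothetical names) of the kind of entries the servicebinding.io reconciler injects into the pod template per the spec: a volume projecting the binding secret, plus a mount under `SERVICE_BINDING_ROOT`. It is this template edit that triggers the restart described above.

```yaml
# Fragment of the pod template after binding `my-binding` (illustrative)
spec:
  containers:
  - name: app
    env:
    - name: SERVICE_BINDING_ROOT        # where apps discover bindings
      value: /bindings
    volumeMounts:
    - name: servicebinding-my-binding   # injected by the reconciler
      mountPath: /bindings/my-binding
      readOnly: true
  volumes:
  - name: servicebinding-my-binding     # injected by the reconciler
    projected:
      sources:
      - secret:
          name: my-binding-secret       # hypothetical binding secret
```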
Reproduction steps

Acceptance
GIVEN I have deployed an app `a` to Korifi and have created a service instance `si`
WHEN I run `cf bind-service a si`
THEN I can continue using the app without experiencing any downtime