reactive-tech / kubegres

Kubegres is a Kubernetes operator that deploys one or more clusters of PostgreSQL instances and manages database replication, failover and backup.
https://www.kubegres.io
Apache License 2.0
1.32k stars, 74 forks

Lots of statefulsets? #57

Closed: joes closed this issue 3 years ago

joes commented 3 years ago

Hi,

I seem to get a lot of "singleton" statefulsets, i.e. one statefulset with spec.replicas: 1 for each replica defined in the Kubegres resource. In other words, each Kubegres replica results in one statefulset that manages exactly one pod/replica. Is this intended (as designed) or a bug?


For example, spec.replicas: 3 in this Kubegres resource:

apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: example-postgres-db
  namespace: db

spec:
  replicas: 3

Results in these three statefulsets (kubectl get statefulset.apps -n db):

NAME                    READY   AGE
example-postgres-db-1   1/1     28m
example-postgres-db-2   1/1     27m
example-postgres-db-3   1/1     27m

Each of the above statefulsets is defined with spec.replicas: 1.

alex-arica commented 3 years ago

Thank you for your message.

I confirm that this is the intended design. We mention StatefulSets in the Getting Started page.

We create one StatefulSet per Pod because at some stage a Replica Pod might be promoted to Primary. We could not do that if a single StatefulSet managed all Pods, because Kubegres does not have access to the way a StatefulSet manages its associated Pod(s).

Kubegres sets properties on each StatefulSet, which then updates its associated Pod. In other words, Kubegres never updates a Pod directly; it delegates the lifecycle of the Pod to a StatefulSet.

joes commented 3 years ago

So the statefulsets - as used by Kubegres - are only placeholders/proxies for a single pod and the Kubegres-resource itself actually serves more or less as a statefulset in the terminology of Kubernetes?

My confusion arose from trying to understand the design by applying my understanding of the standard definition of statefulsets here. Such confusion might perhaps be alleviated by placing a note/warning in the documentation that the standard definition and usage of statefulsets does not really apply in this instance.

However, I would like to understand, if possible, what Kubegres gains by using statefulsets as definitions/stand-ins for pods. What does the statefulset design provide? Why doesn't Kubegres just create and manage pods directly?

If statefulsets exposed more APIs giving access to the way they manage their associated Pod(s), would you then consider using statefulsets as "intended"?

Thanks for your kind assistance so far. Kubegres is now up and running and seems to do the job. I really do appreciate the more straightforward approach of Kubegres when compared to other Postgres operators.

alex-arica commented 3 years ago

Kubegres manages StatefulSet resources and also other resources such as ConfigMap, Services and CronJob (if backup is enabled).

A StatefulSet offers many features which are not available from a bare Pod, such as giving each Pod a stable network identity and managing volumeClaimTemplates, among many others.

If Kubegres had to manage the Pods directly, we would have to write those features ourselves.
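
For illustration, a single-replica StatefulSet using the two features named above might look roughly like the sketch below. This is not the manifest Kubegres actually generates: the labels, container image, volume name, storage size and headless Service name are all assumptions made for the example.

```yaml
# Illustrative sketch only; not actual Kubegres output.
# Labels, image, volume name, size and serviceName are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-postgres-db-1
  namespace: db
spec:
  replicas: 1                        # one Pod per StatefulSet
  serviceName: example-postgres-db   # assumes a headless Service with this name exists
  selector:
    matchLabels:
      app: example-postgres-db
      index: "1"
  template:
    metadata:
      labels:
        app: example-postgres-db
        index: "1"
    spec:
      containers:
        - name: postgres
          image: postgres:14         # hypothetical image tag
          # env (e.g. POSTGRES_PASSWORD) omitted for brevity
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:              # the StatefulSet creates one PVC per Pod from this template
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

With a manifest like this, the StatefulSet controller would name the Pod example-postgres-db-1-0 and its PersistentVolumeClaim postgres-data-example-postgres-db-1-0, and both names would stay the same if the Pod is recreated. That is the kind of behaviour an operator would otherwise have to implement itself when managing bare Pods.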

Regarding a warning, I am not convinced it is necessary, as this approach adds neither overhead nor concerns to the resources in Kubernetes. Also, in the documentation we highlight the fact that one StatefulSet is created for each Pod, in the section listing the created resources.

joes commented 3 years ago

Ok, I hear you and thank you for taking the time to answer my questions.