Question: Why start each replica in its own StatefulSet?

benkroeger commented 2 years ago

I noticed that this operator creates a new StatefulSet for each replica. If I'm not mistaken, you can configure the number of replicas in a single StatefulSet, which would leave the scaling to Kubernetes.

So, unless I'm missing something, it appears to me that you can leverage "scaling a single StatefulSet" and thus reduce operator complexity.
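
To make the suggestion concrete, here is a minimal sketch of what "a single StatefulSet with several replicas" looks like. It is written in Python and only emits a JSON manifest (which `kubectl apply -f -` accepts); it is not how Kubegres works today. The name `my-postgres`, the image `postgres:14`, the 1Gi volume size, and the helper `single_statefulset` are all made up for illustration, a headless Service with the same name is assumed to exist, and the sketch does not configure PostgreSQL replication itself.

```python
import json

def single_statefulset(name, replicas, image="postgres:14", storage="1Gi"):
    """Build one StatefulSet manifest; scaling is left to spec.replicas."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,   # assumed headless Service, not created here
            "replicas": replicas,  # Kubernetes creates pods <name>-0 .. <name>-(N-1)
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": "postgres",
                        "image": image,  # hypothetical image tag
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/postgresql/data"}
                        ],
                    }]
                },
            },
            # One PVC per pod is created from this template and kept across restarts.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }

manifest = single_statefulset("my-postgres", replicas=3)
print(json.dumps(manifest, indent=2))
```

With this shape, changing the replica count is a one-field update (or `kubectl scale statefulset my-postgres --replicas=5`) rather than the operator creating another StatefulSet.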

Is there a specific reason for doing it the way it is currently implemented? Would you mind sharing?

2fst4u commented 2 years ago

This is definitely a concern. The new StatefulSets are the reason new PVCs keep being created, which is mentioned in other issues.

The way it is currently implemented causes a couple of concerns:

- Doesn't it mean the entire replica is built from scratch and has to replicate from another replica each time? If a StatefulSet were used in the conventional manner, the rescheduled pod could remount the existing data.
- It causes new PVCs to be created. If you have a big update rollout and a lot of node drains, you can end up with dozens of unnecessary PVs.
- It prevents the use of something like a pod disruption budget to help prevent too many pods going down and coming back up too quickly.
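
As a rough illustration of the points above, the sketch below shows two things for the conventional case: the PVC names a single StatefulSet would keep reusing, and a PodDisruptionBudget manifest that selects its pods by label. It is Python that only builds names and a JSON manifest. The names `data`, `my-postgres`, and `my-postgres-pdb`, the `app` label, and both helper functions are hypothetical and not taken from Kubegres.

```python
import json

def stable_pvc_names(claim_template, statefulset, replicas):
    """PVCs are named <claim template>-<StatefulSet name>-<ordinal>.

    A pod rescheduled with the same ordinal binds to the same claim again,
    so no extra PVC is created and the existing data is remounted.
    """
    return [f"{claim_template}-{statefulset}-{i}" for i in range(replicas)]

def pod_disruption_budget(name, app_label, max_unavailable=1):
    """Build a PodDisruptionBudget limiting voluntary disruptions (e.g. node drains)."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "maxUnavailable": max_unavailable,
            # Matches every pod carrying this label (hypothetical label key).
            "selector": {"matchLabels": {"app": app_label}},
        },
    }

pvcs = stable_pvc_names("data", "my-postgres", 3)
pdb = pod_disruption_budget("my-postgres-pdb", "my-postgres")
print(pvcs)
print(json.dumps(pdb, indent=2))
```

The default ordinals start at 0, so the three claims here are `data-my-postgres-0` through `data-my-postgres-2`, and they stay the same across drains and rollouts.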

Just putting these down in writing to help make the case for changing the approach.

teebu commented 2 years ago

I came here looking for an explanation for unattached PVCs and what to do about them. Over time their number just keeps growing, for whatever reason.

Also, I'm not sure, but I think that if I start again and it picks up from 1, the data in the higher-numbered ones doesn't match.

reactive-tech / kubegres

Question: Why start each replica in its own StatefulSet? #90