ydb-platform / ydb-kubernetes-operator

YDB Operator allows you to deploy your own YDB cluster in Kubernetes

feat: Improve operator for day-2 tasks #170

Open tunatoksoz opened 10 months ago

tunatoksoz commented 10 months ago

Feature Request

Describe the Feature Request

The operator could enable day-2 workflows declaratively. Some of the issues are called out in the docs [1], but they seem fairly limited. I have been using CrunchyData's Postgres operator, and have looked into the CloudNativePG operator as well, and they offer a pretty good blueprint of what could work.

  1. Backups. Following the example from CrunchyData, a backup section could take a secret for S3 credentials, the frequency of full and incremental backups, and other configuration for PITR:

    backups:
      pgbackrest:
        configuration:
        - secret:
            name: pgo-gcs-creds
        image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.47-0
        manual:
          options:
          - --type=full
          repoName: repo1
        repos:
        - name: repo1
          s3:
            bucket: <BUCKET>
            endpoint: <ENDPOINT>
            region: <REGION>
          schedules:
            full: 0 7 */2 * *

  2. Storage size increase. Crunchy allows increasing the number of replicas and the storage size of each instance. Based on the YDB docs, it's not possible to update the manifest and apply changes. I have not tested whether manually increasing the PersistentVolumeClaim size works, but doing this in the manifest would be ideal.

  3. Affinity. I believe the operator supports node affinity through the CRD, but it's not called out in the documentation, so one has to check the CRD definition. An example would probably suffice.

  4. Users/Databases. This is not a strict must, and Database is a separate CRD object, so it may be somewhat moot, but Crunchy encapsulates databases and users in the same CRD.

  5. Similar to (4), Crunchy encapsulates monitoring (exporter) and the administrative UI (pgAdmin) in a single CRD object, which could work better.
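
For reference on item 2, the generic Kubernetes mechanism for growing a volume is sketched below: the StorageClass must allow expansion, and the claim's requested size is then raised. This is plain Kubernetes, not an operator feature. All names, the provisioner, and the sizes are hypothetical placeholders, and whether a YDB storage node picks up the extra space afterwards is untested here.

    # StorageClass that permits resizing of claims created from it.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: expandable-ssd                 # hypothetical name
    provisioner: example.com/csi-driver    # placeholder; use your cluster's CSI driver
    allowVolumeExpansion: true
    ---
    # Existing claim, re-applied with a larger request.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: data-storage-0                 # hypothetical claim name
    spec:
      storageClassName: expandable-ssd
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 200Gi                   # raised from a smaller value, e.g. 100Gi

Kubernetes only allows a claim's size to grow, never shrink, and the underlying volume driver must support expansion for the change to take effect.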

[1] https://ydb.tech/docs/en/getting_started/kubernetes#:~:text=The%20cluster%20configuration%20is%20static

Describe Preferred Solution

A more capable CRD would help. The CLI is fine, but it is hard to use in DevOps workflows.

If the feature request is approved, would you be willing to submit a PR?

No. I'm not sure I know the tool well enough to build the operator on it.

Jorres commented 10 months ago

@tunatoksoz thank you for the feedback!

We have been thinking about improvements 1 and 2 as well, even though they are not high on our list of priorities. Backups in larger YDB installations are currently managed by a separate control plane component, which is likely to be open-sourced in the near future as well, so we'd like to give it some time to see where the borders of responsibility lie between the operator and that control plane component.

The easiest suggestion to address is 3. I will make sure the documentation includes an example of how to set up node affinity (because it is indeed supported).
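
As a rough illustration of what such an example might look like, here is a minimal sketch. It assumes the Storage resource accepts a standard Kubernetes `affinity` block under `spec`; the field placement, the `apiVersion`, the resource name, and the node label are all assumptions to verify against the installed CRD definition.

    apiVersion: ydb.tech/v1alpha1          # assumed API group/version; check the installed CRD
    kind: Storage
    metadata:
      name: storage-sample                 # hypothetical name
    spec:
      # ...other required Storage fields omitted...
      affinity:                            # assumed to follow the core/v1 Affinity schema
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: example.com/ydb-storage   # hypothetical node label
                operator: In
                values:
                - "true"

With a rule like this, pods would only be scheduled onto nodes carrying the given label. A `preferredDuringSchedulingIgnoredDuringExecution` rule is the softer alternative when strict placement is not required.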

Suggestions 4 and 5 are unlikely to change in the immediate future; as you've said, it's not a strict must. However, idea 4 does sound appealing to me. I'll make sure it reaches the team and we'll give it more thought.