noobaa / noobaa-operator

Operator for NooBaa - object data service for hybrid and multi cloud environments :cloud: :wrench:
https://www.noobaa.io
Apache License 2.0

Helmization of noobaa operator #305

Open yaroslavkasatikov opened 4 years ago

yaroslavkasatikov commented 4 years ago

Hello team,

I ran into several issues while trying to create a Helm chart for creating the resources. I have attached the YAML files that were generated by Helm.

List of issues:

1) If I create all resources (noobaa, backingstores, bucketclasses, storage classes and obc) simultaneously, all my backingstores are rejected because there is no 'noobaa' object created yet, and they stay in that status even after noobaa is created. This blocks everything from continuing.

2) BackingStore removal. I deleted the helm chart with all resources (except noobaa, which I moved out of the chart because of issue no. 1). All obc, sc and bucketclasses are removed, but the BackingStores get stuck:

  Conditions:
    Last Heartbeat Time:   2020-05-15T13:44:23Z
    Last Transition Time:  2020-05-15T14:10:44Z
    Message:               DeletePoolAPI cannot complete because pool "pvc-pool-2" has buckets attached
    Reason:                ResourceInUse
    Status:                Unknown
    Type:                  Available
    Last Heartbeat Time:   2020-05-15T13:44:23Z
    Last Transition Time:  2020-05-15T14:10:44Z
    Message:               DeletePoolAPI cannot complete because pool "pvc-pool-2" has buckets attached
    Reason:                ResourceInUse
    Status:                False
    Type:                  Progressing
    Last Heartbeat Time:   2020-05-15T13:44:23Z
    Last Transition Time:  2020-05-15T14:10:44Z
    Message:               DeletePoolAPI cannot complete because pool "pvc-pool-2" has buckets attached
    Reason:                ResourceInUse
    Status:                True
    Type:                  Degraded
    Last Heartbeat Time:   2020-05-15T13:44:23Z
    Last Transition Time:  2020-05-15T14:10:44Z
    Message:               DeletePoolAPI cannot complete because pool "pvc-pool-2" has buckets attached
    Reason:                ResourceInUse
    Status:                Unknown
    Type:                  Upgradeable

So there are no obc:

[yaroslav@yaroslav noobaa-resources]$ oc get obc

No resources found. But the buckets still exist in the UI (see screenshot 'bucket.png'), so it blocks correct removal.

3) It's not related to helm, but I also got an issue with the csv:

[yaroslav@yaroslav noobaa-resources]$ oc get csv
NAME                                   DISPLAY                        VERSION   REPLACES                               PHASE
lib-bucket-provisioner.v1.0.0          lib-bucket-provisioner         1.0.0                                            Failed
noobaa-operator.v2.1.0                 NooBaa Operator                2.1.0     noobaa-operator.v2.0.10                Succeeded
openshift-pipelines-operator.v0.11.2   OpenShift Pipelines Operator   0.11.2    openshift-pipelines-operator.v0.10.7   Succeeded

Status:

  Version:  1.0.0
Status:
  Certs Last Updated:  <nil>
  Certs Rotate At:     <nil>
  Conditions:
    Last Transition Time:  2020-05-14T14:57:16Z
    Last Update Time:      2020-05-14T14:57:16Z
    Message:               OwnNamespace InstallModeType not supported, cannot configure to watch own namespace
    Phase:                 Failed
    Reason:                UnsupportedOperatorGroup
  Last Transition Time:    2020-05-14T14:57:16Z
  Last Update Time:        2020-05-14T14:57:16Z
  Message:                 OwnNamespace InstallModeType not supported, cannot configure to watch own namespace
  Phase:                   Failed
  Reason:                  UnsupportedOperatorGroup

I installed with this subscription yaml:

apiVersion: v1
kind: Namespace
metadata:
  name: noobaa
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: noobaa
spec:
  targetNamespaces:
  - noobaa
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: noobaa
  namespace: noobaa
spec:
  channel: alpha
  name: noobaa-operator
  source: operatorhubio-catalog
  sourceNamespace: openshift-marketplace

Please feel free to ask any questions or request additional details.

The YAML generated by Helm:

# Source: noobaa/templates/noobaa-operator.yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: noobaa
  namespace: noobaa
spec:
  channel: alpha
  name: noobaa-operator
  source: operatorhubio-catalog
  sourceNamespace: openshift-marketplace
---
# Source: noobaa/templates/noobaa-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: noobaa-pvc-pool-1.noobaa.io
provisioner: noobaa.noobaa.io/obc
reclaimPolicy: Delete
parameters:
  bucketclass: pvc-pool-1
---
# Source: noobaa/templates/noobaa-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: noobaa-pvc-pool-2.noobaa.io
provisioner: noobaa.noobaa.io/obc
reclaimPolicy: Delete
parameters:
  bucketclass: pvc-pool-2
---
# Source: noobaa/templates/noobaa-resource.yaml
#apiVersion: noobaa.io/v1alpha1
#kind: NooBaa
#metadata:
#  labels:
#    app: noobaa
#  name: noobaa
#  namespace: noobaa
#
#spec:
#  dbResources:
#    requests:
#      cpu: "1"
#      memory: 1Gi
---
# Source: noobaa/templates/pvpool.yaml
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  finalizers:
  - noobaa.io/finalizer
  labels:
    app: noobaa
  name: pvc-pool-1
  namespace: noobaa
spec:
  pvPool:
    numVolumes: 3
    resources:
      requests:
        storage: 30G
    storageClass: gp2
  type: pv-pool
---
# Source: noobaa/templates/pvpool.yaml
apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  finalizers:
  - noobaa.io/finalizer
  labels:
    app: noobaa
  name: pvc-pool-2
  namespace: noobaa
spec:
  pvPool:
    numVolumes: 3
    resources:
      requests:
        storage: 30G
    storageClass: gp2
  type: pv-pool
---
# Source: noobaa/templates/bucketclass.yaml
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  labels:
    app: noobaa
  name: pvc-pool-1
  namespace: noobaa
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - pvc-pool-1
---
# Source: noobaa/templates/bucketclass.yaml
apiVersion: noobaa.io/v1alpha1
kind: BucketClass
metadata:
  labels:
    app: noobaa
  name: pvc-pool-2
  namespace: noobaa
spec:
  placementPolicy:
    tiers:
    - backingStores:
      - pvc-pool-2
---
# Source: noobaa/templates/obc-pool.yaml
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: bucket1
  namespace: noobaa
spec:
  generateBucketName: bucket1
  storageClassName: noobaa-pvc-pool-1.noobaa.io
---
# Source: noobaa/templates/obc-pool.yaml
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: bucket2
  namespace: noobaa
spec:
  generateBucketName: bucket2
  storageClassName: noobaa-pvc-pool-2.noobaa.io
yaroslavkasatikov commented 4 years ago

[screenshot: buckets]

guymguym commented 4 years ago

Hi @yaroslavkasatikov

Thanks for trying that out and reporting back! BTW, are you planning to publish this chart to a helm repo?

Regarding your points:

1 - You're right, we haven't tried creating all of those at the same time yet, but it's important to support this well. The missing piece is that currently our BackingStore and BucketClass controllers do not watch for changes to the NooBaa CR. However, this is easy to add, in a similar way to how the BucketClass controller watches BackingStores and queues the "affected" BucketClasses, see here: https://github.com/noobaa/noobaa-operator/blob/984f9c91400f10bb02032df20618e2c5d3b81c12/pkg/controller/bucketclass/bucketclass_controller.go#L43-L50
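To illustrate, here is a minimal sketch of what such a watch could look like in the BackingStore controller, in the style of the linked bucketclass code. The nbv1 import path and the mapNooBaaToBackingStores helper are assumptions, and controller-runtime signatures have changed across versions, so treat this as a sketch of the idea rather than the actual implementation:

package backingstore

import (
	nbv1 "github.com/noobaa/noobaa-operator/pkg/apis/noobaa/v1alpha1" // assumed import path
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/controller"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
	"sigs.k8s.io/controller-runtime/pkg/source"
)

// watchNooBaaCR makes the BackingStore controller react to NooBaa CR events,
// so BackingStores that were rejected before the system existed get
// re-reconciled once it comes up.
func watchNooBaaCR(c controller.Controller) error {
	return c.Watch(&source.Kind{Type: &nbv1.NooBaa{}}, &handler.EnqueueRequestsFromMapFunc{
		ToRequests: handler.ToRequestsFunc(func(obj handler.MapObject) []reconcile.Request {
			// Re-queue every BackingStore that lives next to this NooBaa CR.
			return mapNooBaaToBackingStores(types.NamespacedName{
				Namespace: obj.Meta.GetNamespace(),
				Name:      obj.Meta.GetName(),
			})
		}),
	})
}

// mapNooBaaToBackingStores is a hypothetical helper: in the real controller
// it would use the manager's client to list nbv1.BackingStoreList in the
// NooBaa CR's namespace and return one reconcile.Request per item.
func mapNooBaaToBackingStores(nn types.NamespacedName) []reconcile.Request {
	return nil // listing logic omitted in this sketch
}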

2 - OK, so this happens because we have a finalizer on BackingStores so that the noobaa-operator can update noobaa-core that the resource was deleted. However, since your chart also removes the OLM Subscription, the operator is already deleted and cannot respond to that poor stuck BackingStore. The same happens if you delete the entire namespace that contains the operator and the running noobaa core. This is essentially the root CRD problem of helm, which you can read about here - https://helm.sh/docs/chart_best_practices/custom_resource_definitions. So if you used an OLM dependency (more complex than helm) it would order those deps for you, but for helm, even if we separate into 2 charts, the creation of the second chart may fail because the OLM Subscription deploys the operator and the CRDs asynchronously. Also, I gather from this issue that helm will not delete those charts in order anyhow - https://github.com/helm/helm/issues/6283.
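As an aside, for a BackingStore that is already stuck after the operator is gone, a blunt manual escape hatch is to clear its finalizer by hand. This skips the operator's cleanup logic entirely, so it is only reasonable once the noobaa system itself is already gone:

# Last resort: remove the finalizer so Kubernetes can complete the delete.
# This bypasses the operator's cleanup, so use it only when the operator
# that owns the finalizer is no longer running.
kubectl patch backingstore pvc-pool-2 -n noobaa --type=merge \
  -p '{"metadata":{"finalizers":[]}}'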

So I am not sure we can use OLM to install the noobaa-operator for helm. Perhaps we need to use the yamls from noobaa crd yaml and noobaa operator yaml to deploy the operator, and then have a separate chart for the noobaa system and its configuration.

Let me know what you think as this is an involved matter.

3 - This last problem is because the lib-bucket CSV specifies that it supports only the AllNamespaces install mode, but noobaa-operator and the OperatorGroup that we use support only the OwnNamespace install mode. I submitted a PR to fix it in community-operators - https://github.com/operator-framework/community-operators/pull/1749 - not sure how it slipped through (I definitely remember fixing it before, maybe just for openshift). BTW, changing to separate charts as described in the previous point, and deploying the operator directly without OLM and operatorhub, will resolve this too, since the CRDs from the lib-bucket package are also included in the noobaa crd yaml output.
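For context on the mismatch: an OperatorGroup selecting the AllNamespaces install mode (what the lib-bucket CSV expects) is one that specifies no targetNamespaces at all, e.g. a sketch like:

apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: noobaa
spec: {}

whereas the OperatorGroup in the subscription yaml above targets only the noobaa namespace (OwnNamespace), which is exactly the mode the lib-bucket CSV rejects.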

Thanks!

yaroslavkasatikov commented 4 years ago

Hi @guymguym !

Many thanks for your reply. I really appreciate it.

As for publishing the chart: probably yes, eventually, but for now it doesn't work without a fix for no. 1. I totally agree that all the resources should be linked to each other in both directions. It seems all resources should stay in a 'pending' state until their dependencies are resolved. It would be really wonderful if you could add this capability.

As for changing the installation method - yes, I will try it and get back to you. As you mentioned above, I separated the charts into an operator installer and the noobaa resources. As for the operators - I found only one issue (no. 3), which is known and you are working on it. As for the resources, I found that I need to split them into 3 charts and install them in this order (sketched below):

1) Operator (let's call the helm release operator)
2) NooBaa (noobaa)
3) SC, BackingStores, BucketClasses, OBC (noobaa-resources)
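A rough sketch of that install order (the chart paths and the wait conditions are assumptions; the Available condition mirrors the one shown on the BackingStore output above):

# 1) operator chart - wait for the operator deployment to come up
helm install operator ./charts/noobaa-operator -n noobaa --create-namespace
kubectl -n noobaa rollout status deployment/noobaa-operator

# 2) noobaa system chart - wait for the NooBaa CR to become Available
helm install noobaa ./charts/noobaa -n noobaa
kubectl -n noobaa wait noobaa/noobaa --for=condition=Available --timeout=10m

# 3) resources chart (SC, BackingStores, BucketClasses, OBC)
helm install noobaa-resources ./charts/noobaa-resources -n noobaa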

So, if I have all charts installed and I want to delete the noobaa-resources chart (for example, because I want to recreate the resources), I get this state:

1) The operator is working, because the helm release operator wasn't affected. That's fine.
2) The noobaa resource (and all its pods, like noobaa-db and noobaa-core) is up and running. That's also fine.
3) All obc resources are removed, but the buckets configured by those obc resources are still up and accessible. I can see them in the UI and connect to them through aws-cli. That's an issue from my point of view.
4) All backingstore resources are in 'Rejected' state, because the buckets are not really removed. But if I remove a bucket in the UI, the backingstore resource is released and removed successfully. That's an issue too.
5) I don't remember exactly what happens with the bucketclasses, I will recheck tomorrow, but I don't remember any issues with them.
6) Storage classes are deleted.

A few words about the target state. I want to install the noobaa operator and control its resources in a GitOps model (using ArgoCD), so all these resources should follow a declarative model. It's OK to have temporary errors during deployment, or pending resources, but in the end I expect the state to match what was declared (i.e. buckets are created or removed, and so on).

So, I will try to install the operator from the crd yamls (though I'm not sure it can solve the issue with removal) and get back to you.

Many thanks again,

Yaroslav

yaroslavkasatikov commented 4 years ago

Hi @guymguym I tried to create the operator with yaml, but the issue is the same. As I told you before, the operator installs successfully; the issue is the link between the noobaa resource and the others.

And one more question about no. 1. Do you plan to add this feature:

The missing piece is that currently our BackingStore and BucketClass controllers do not watch for changes to the NooBaa CR.

guymguym commented 4 years ago

Hi @yaroslavkasatikov Sorry for the delay. I will check it out this week and update.

aelbarkani commented 4 years ago

Hi! Any update on this issue?

jcpunk commented 1 year ago

I'll confess I'd love to see an official chart for this up at artifacthub.io... My site is trying to standardize on Helm packages for loading 3rd-party objects. OLM looks interesting, but...

guymguym commented 1 year ago

@jcpunk Thanks for the comment and pointing out your interest in supporting this.

We had a slack discussion about using noobaa install yaml > manifests.yaml and using that as the helm chart basis. However, this is somewhat of a general problem with converting an operator deployment to a helm chart. The problem is of course that you need the operator to keep working in order to clean up the noobaa system resources, and only then remove the operator. One suggestion was to break this into separate helm charts - one for the operator deployment (as dumped from noobaa crd yaml and noobaa operator yaml) and another for the noobaa system deployment (noobaa system yaml). This allows cleanup in stages - first delete all the noobaa system charts, wait for the operator to clean them up, and only then clean up the operator chart.

I wonder if there's a more idiomatic pattern for supporting operators that manage deployments in helm? Or is the main approach to avoid deploying operators with helm and deploy the system directly, which would be hard to maintain in conjunction with the operator pattern...

@nimrod-becker @dannyzaken WDYT about taking on this challenge to support vanilla kubernetes with helm?