Closed. kmanev073 closed this issue 3 years ago.
Yes, Enterprise documentation is extremely vague and confusing, probably by design. I assume that MongoDB Inc. is using this project to hook in more customers for the Enterprise operator.
While this does not fully answer the question, I also had the same question which was partially answered here: https://github.com/mongodb/mongodb-kubernetes-operator/issues/145#issuecomment-676489060
As for whether this is production-ready or not, it seems it is getting closer, since this project now requires anyone who installs the latest v0.2.x release to have authentication enabled, without any option to disable it. I depend on this operator in a production system, and I never plan on using the Enterprise operator. This is also the only actual option for distributed MongoDB on Kubernetes without the Enterprise operator.
There have been some great advancements within the past few months and the maintainers seem to be working hard on adding more features. For example, previously you could not specify the size of your persistentVolumeClaim which was a huge oversight (10Gi default per volume? Ouch), but now you can. I'm sure more usability features are on the way.
Cool, and thanks! If you rely on this operator for production, how do you do backups and so on? Is there any good documentation? I spent around 3-4 hours and still couldn't run it.
Sure thing. What is your intended setup?
For backups, I snapshot the PVC and export the underlying Ceph RBD image from the rook-ceph-tools pod:

kubectl -n rook-ceph exec ${rookCephToolsPod} -- bash -c "rm -f /snapshot.img && rbd export replicapool/csi-snap-${snapshotHandle} /snapshot.img"

then a script uploads it as a tgz to AWS S3 Glacier for cheap long-term storage. To restore, I copy the image back into the tools pod and import it:

sel=app=rook-ceph-tools
ns=rook-ceph
pod=$(kubectl get pod -n $ns -l $sel --sort-by=.metadata.creationTimestamp | sed -n 2p | awk '{print $1}')
kubectl -n $ns cp snapshot.img ${pod}:/snapshot.img
kubectl -n $ns exec -it ${pod} -- rbd import /snapshot.img replicapool/csi-snap-1be36668-e5f7-11ea-8658-d2f514220151

where csi-snap-xxxx is the snapshot handle recorded in the volumeSnapshotContent.
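For what it's worth, the `sed -n 2p | awk '{print $1}'` pipeline above just grabs a pod name from the second line of `kubectl get pod` output, because the first line is the column header. A minimal local sketch with simulated output (the pod name suffix here is made up for illustration):

```shell
# Simulated `kubectl get pod` output: a header row, then one pod per line.
fake_output='NAME                    READY   STATUS    RESTARTS   AGE
rook-ceph-tools-6f7b9   1/1     Running   0          3d'

# Take line 2 (the first pod after --sort-by) and print column 1 (the name).
pod=$(printf '%s\n' "$fake_output" | sed -n 2p | awk '{print $1}')
echo "$pod"   # -> rook-ceph-tools-6f7b9
```

To find ${snapshotHandle} for a given snapshot, the handle is recorded on the bound VolumeSnapshotContent object, e.g. `kubectl get volumesnapshotcontent <name> -o jsonpath='{.status.snapshotHandle}'` (field name per the snapshot.storage.k8s.io API).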
Although I am not well-read on the different failure modes of volume snapshots, every snapshot I have taken so far has restored fine, and stateful applications have had no issue continuing from snapshots. I do not think anyone wants to maintain custom mongodump/mongorestore scripts. With volume-level snapshots you do not have to worry about how each application wants to be backed up and restored, unless something critical stays in memory for some time, in which case disk-level backups could never help you anyway.
Is there any good documentation
I think the maintainers are currently working very hard (their prompt response time on issues is greatly appreciated) on trying to add MongoDB features and fix bugs, so understandably, documentation comes secondary until they catch up.
Here is a demo YAML file for a ReplicaSet backing Rocket.Chat, which is easy to set up and great for testing this MongoDB operator.
kubectl create secret generic k8s-secrets --from-env-file "$SECRETS_KUBERNETES_PATH"
This creates a secret with key-value pairs from a text file such as:
ROCKETCHAT_MDB_PASSWORD=bklahsdkjaskjdhasdasd
...
...
SENDGRID_SMTP_USERNAME=1i91789261tg
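Secret values end up base64-encoded in the cluster under `.data.<key>`; a quick local round-trip (using the sample password from the env file above) shows what a jsonpath query against the Secret would hand back before decoding:

```shell
# Kubernetes stores Secret values base64-encoded under .data.<key>.
plain='bklahsdkjaskjdhasdasd'                   # sample value from the env file above
encoded=$(printf '%s' "$plain" | base64)        # what .data.ROCKETCHAT_MDB_PASSWORD holds
decoded=$(printf '%s' "$encoded" | base64 -d)   # what the container actually sees
echo "$decoded"   # -> bklahsdkjaskjdhasdasd
```

Against a live cluster the equivalent sanity check would be something like `kubectl get secret k8s-secrets -o jsonpath='{.data.ROCKETCHAT_MDB_PASSWORD}' | base64 -d` (assumes GNU base64's `-d` flag).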
Note: from what I observe, the authentication credentials do not work until all replica set members are in the Ready state; the operator applies them to all 3 members at once. Also, you currently need to use "data-volume" as the volumeClaimTemplate's metadata.name for it to replace the default one. https://github.com/mongodb/mongodb-kubernetes-operator/issues/177 claims this is fixed, but whenever I tried any other name, 2 volumes per replica set member were still created. The ROOT_URL of https://usuih.com/chat is just the base route the ingress-nginx controller forwards to (service port 11001); it will be different for anyone else using this example.
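Since the credentials only start working once every member is Ready, a small retry wrapper (the same shape as the initContainer's until-loop in the demo YAML) saves guessing when it is safe to connect. A sketch; the `kubectl wait` label and timeout shown in the comment are assumptions for this setup:

```shell
# Generic retry loop: keep running a command until it succeeds or we give up.
wait_until() {  # usage: wait_until <max_tries> <command...>
  tries=$1; shift
  n=0
  until "$@"; do
    n=$((n+1))
    [ "$n" -lt "$tries" ] || return 1
    sleep 1
  done
}

# With a live cluster, you would wrap a real readiness check instead, e.g.
# (hypothetical label/timeout):
#   kubectl wait pod -l app=rocketchat-mdb-svc --for=condition=Ready --timeout=600s
wait_until 5 true && echo "ready"   # -> ready
```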
apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: rocketchat-mdb
spec:
  members: 3
  type: ReplicaSet
  version: 4.2.9
  persistent: true
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: admin
      db: admin
      passwordSecretRef:
        name: k8s-secrets
        key: ROCKETCHAT_MDB_PASSWORD
      roles:
        - name: root
          db: local
        - name: readWrite
          db: local
        - name: root
          db: test
        - name: readWrite
          db: test
  statefulSet:
    spec:
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: rocketchat-srv-cluster-ip
  labels:
    component: rocketchat-srv-cluster-ip
spec:
  type: ClusterIP
  selector:
    component: rocketchat-srv
  ports:
    - name: rocketchat-srv-http
      port: 11001
      targetPort: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rocketchat-srv
spec:
  replicas: 1
  selector:
    matchLabels:
      component: rocketchat-srv
  template:
    metadata:
      labels:
        component: rocketchat-srv
    spec:
      initContainers:
        - name: rocketchat-mdb
          image: mongo:4.2.9
          env:
            - name: ROCKETCHAT_MDB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: k8s-secrets
                  key: ROCKETCHAT_MDB_PASSWORD
          command: ['sh', '-c', 'until mongo mongodb://admin:$(ROCKETCHAT_MDB_PASSWORD)@rocketchat-mdb-0.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-1.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-2.rocketchat-mdb-svc.default.svc.cluster.local:27017/?authSource=admin\&replicaSet=rocketchat-mdb --eval "print(\"successful connection\")"; do sleep 1; done;']
      containers:
        - name: rocketchat-srv
          image: rocketchat/rocket.chat:3.6.1
          env:
            - name: ROCKETCHAT_MDB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: k8s-secrets
                  key: ROCKETCHAT_MDB_PASSWORD
            - name: ROOT_URL
              value: "https://usuih.com/chat"
            - name: MONGO_URL
              value: "mongodb://admin:$(ROCKETCHAT_MDB_PASSWORD)@rocketchat-mdb-0.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-1.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-2.rocketchat-mdb-svc.default.svc.cluster.local:27017/?authSource=admin&replicaSet=rocketchat-mdb"
            - name: MONGO_OPLOG_URL
              value: "mongodb://admin:$(ROCKETCHAT_MDB_PASSWORD)@rocketchat-mdb-0.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-1.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-2.rocketchat-mdb-svc.default.svc.cluster.local:27017/local?authSource=admin&replicaSet=rocketchat-mdb"
          ports:
            - name: http
              containerPort: 3000
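For reference, the long connection strings in the manifest follow a fixed pattern: `<name>-N.<name>-svc.<namespace>.svc.cluster.local:27017` for each member. A small sketch that assembles the URI from the resource name, so it stays consistent if you change the member count (namespace `default` assumed; the `$(ROCKETCHAT_MDB_PASSWORD)` placeholder is left literal, as Kubernetes substitutes it, not the shell):

```shell
# Build the replica-set connection URI from the MongoDB resource name.
name=rocketchat-mdb
ns=default
members=3

hosts=""
i=0
while [ "$i" -lt "$members" ]; do
  # Append host N, comma-separated after the first entry.
  hosts="${hosts:+$hosts,}${name}-${i}.${name}-svc.${ns}.svc.cluster.local:27017"
  i=$((i+1))
done

uri="mongodb://admin:\$(ROCKETCHAT_MDB_PASSWORD)@${hosts}/?authSource=admin&replicaSet=${name}"
echo "$uri"
```

You can then smoke-test connectivity from inside the cluster with a throwaway pod, e.g. `kubectl run --rm -it mongo-client --image=mongo:4.2.9 --restart=Never -- mongo "<uri with the real password>" --eval 'db.runCommand({ping: 1})'`.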
I spent around 3-4 hours and still couldn't run it.
In the Kubernetes world right now, documentation in general is really, really, really awful. If every single issue I had with Kubernetes lasted only 3-4 hours, I would honestly be very happy with Kubernetes. Set your expectations to spend dozens of hours, days, or maybe even weeks fixing a single specific non-trivial issue. Debugging will take you a long time because no one has your custom setup, or things are so new that it is impossible to find help online. Moving from v0.1.1 to v0.2.1 and connecting to it took me a full day. When documentation is lacking, it is very hard to figure out why things are breaking, so you have to systematically iterate through the different combinations of YAML files until things become clearer. For example, I did not know why I could not connect to the first replica set member using the above credentials. After a few hours I took a break and came back, and the credentials worked, because I had waited long enough for the 3 replica set members to come up and for the operator pod to apply them.
The community Operator is about to reach "Beta" state and I would expect it to evolve even more until it gets to a stable state. If you use MongoDB Community, this project should let you run it in Kubernetes. It is still very much a WIP.
The Enterprise Operator targets Enterprise customers and is much further along, with many more features. If you have a MongoDB Enterprise Advanced license, the Enterprise Operator is included, along with Ops Manager, which goes beyond simple installation: MongoDB monitoring, profiling, query analytics, index analysis, LDAP, encryption at rest, and complex configuration and cluster changes, plus Enterprise Backup including point-in-time restore. The Enterprise Operator is also fully supported by the MongoDB Support Team.
@theburi, any idea what the rough pricing for the Enterprise Operator is? I am currently using MongoDB Atlas M30, and I am wondering whether setting it all up myself and paying for the Enterprise Operator would end up at the same or even higher cost...
I wonder why there is no sharding in the community edition. I hope it is on the roadmap.
@deyanp please contact our Sales team for Enterprise pricing - https://www.mongodb.com/pricing
@hi-usui I've just seen your post about backup strategies, very nice and thx for sharing! In this context, are you using the mongo-k8s-operator in production and would you say that it's production-ready (for running replica sets) compared to let's say the Bitnami Helm Mongo chart?
Just my two cents: we tried Bitnami Helm Mongo, but it doesn't look production-ready at all, as it cannot survive "primary" pod restarts (see here: https://github.com/bitnami/charts/issues/6741)
We are looking to see if this Community Operator will work for us better.
Agree with @bmv-ce, I can't remember how many times I fixed the MongoDB deployment of Bitnami repo, it's totally not production ready at all.
i second that
We hope that you're successful in prod - whether that's with this Operator or with Helm Charts.
That said, this Issue tracker is intended to help us track work that the team should take on.
If you'd like to discuss Operators or Charts, I recommend the Community Forum - https://www.mongodb.com/community/forums/tag/kubernetes-operator
I was wondering what the differences are between this operator and the enterprise one. I couldn't find any documentation on the topic: no pricing comparisons, no feature comparisons, no license comparisons... Is this community operator production-ready? Clearly the enterprise one should be...