Closed. kmanev073 closed this issue 3 years ago.
Yes, Enterprise documentation is extremely vague and confusing, probably by design. I assume that MongoDB Inc. is using this project to hook in more customers for the Enterprise operator.
While this does not fully answer the question, I also had the same question which was partially answered here: https://github.com/mongodb/mongodb-kubernetes-operator/issues/145#issuecomment-676489060
As for whether this is production-ready or not, it seems it is getting closer, since this project now requires anyone who installs the latest v0.2.x release to have authentication enabled, without any option to disable it. I depend on this operator in a production system, and I never plan on using the Enterprise operator. This is also the only actual option for distributed MongoDB on Kubernetes without the Enterprise operator.
There have been some great advancements within the past few months and the maintainers seem to be working hard on adding more features. For example, previously you could not specify the size of your persistentVolumeClaim which was a huge oversight (10Gi default per volume? Ouch), but now you can. I'm sure more usability features are on the way.
Cool, and thanks! If you rely on this operator for production, how do you do backups and so on? Is there any good documentation? I spent around 3-4 hours and still couldn't run it.
Sure thing. What is your intended setup?
For backups, I snapshot the PVC and export the underlying Ceph RBD image from the rook-ceph-tools pod:

kubectl -n rook-ceph exec ${rookCephToolsPod} -- bash -c "rm -f /snapshot.img && rbd export replicapool/csi-snap-${snapshotHandle} /snapshot.img"

then a script uploads it as a tgz to AWS S3 Glacier for cheap long-term storage. To restore, I copy the image back into the tools pod and import it:

sel=app=rook-ceph-tools
ns=rook-ceph
pod=$(kubectl get pod -n $ns -l $sel --sort-by=.metadata.creationTimestamp | sed -n 2p | awk '{print $1}')
kubectl -n $ns cp snapshot.img ${pod}:/snapshot.img
kubectl -n $ns exec -it ${pod} -- rbd import /snapshot.img replicapool/csi-snap-1be36668-e5f7-11ea-8658-d2f514220151

where csi-snap-xxxx is the snapshot handle recorded in the volumeSnapshotContent.
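For what it's worth, the `sed -n 2p | awk '{print $1}'` pipeline above just grabs a pod name from the second line of `kubectl get pod` output, because the first line is the column header. A minimal local sketch with simulated output (the pod name suffix here is made up for illustration):

```shell
# Simulated `kubectl get pod` output: a header row, then one pod per line.
fake_output='NAME                    READY   STATUS    RESTARTS   AGE
rook-ceph-tools-6f7b9   1/1     Running   0          3d'

# Take line 2 (the first pod after --sort-by) and print column 1 (the name).
pod=$(printf '%s\n' "$fake_output" | sed -n 2p | awk '{print $1}')
echo "$pod"   # -> rook-ceph-tools-6f7b9
```

To find ${snapshotHandle} for a given snapshot, the handle is recorded on the bound VolumeSnapshotContent object, e.g. `kubectl get volumesnapshotcontent <name> -o jsonpath='{.status.snapshotHandle}'` (field name per the snapshot.storage.k8s.io API).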
Although I am not well-read on the different failure modes of volume snapshots, every snapshot I have taken so far has restored fine, and stateful applications have had no issue continuing from snapshots. I do not think anyone wants to maintain custom mongodump/mongorestore scripts. With volume-level snapshots you do not have to worry about how each application wants to be backed up and restored, unless something critical stays in memory for some time, in which case disk-level backups could never help you anyway.
Is there any good documentation
I think the maintainers are currently working very hard (their prompt response time on issues is greatly appreciated) on trying to add MongoDB features and fix bugs, so understandably, documentation comes secondary until they catch up.
Here is a demo YAML file for a ReplicaSet backing Rocket.Chat, which is easy to set up and great for testing this MongoDB operator.
kubectl create secret generic k8s-secrets --from-env-file "$SECRETS_KUBERNETES_PATH"
This creates a secret with key-value pairs from a text file such as:
ROCKETCHAT_MDB_PASSWORD=bklahsdkjaskjdhasdasd
...
...
SENDGRID_SMTP_USERNAME=1i91789261tg
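Secret values end up base64-encoded in the cluster under `.data.<key>`; a quick local round-trip (using the sample password from the env file above) shows what a jsonpath query against the Secret would hand back before decoding:

```shell
# Kubernetes stores Secret values base64-encoded under .data.<key>.
plain='bklahsdkjaskjdhasdasd'                   # sample value from the env file above
encoded=$(printf '%s' "$plain" | base64)        # what .data.ROCKETCHAT_MDB_PASSWORD holds
decoded=$(printf '%s' "$encoded" | base64 -d)   # what the container actually sees
echo "$decoded"   # -> bklahsdkjaskjdhasdasd
```

Against a live cluster the equivalent sanity check would be something like `kubectl get secret k8s-secrets -o jsonpath='{.data.ROCKETCHAT_MDB_PASSWORD}' | base64 -d` (assumes GNU base64's `-d` flag).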
Note: from what I observe, the authentication credentials do not work until all replica set members are in the Ready state; the operator applies them to all 3 members at once. Also, you currently need to use "data-volume" as the volumeClaimTemplate's metadata.name for it to replace the default one. https://github.com/mongodb/mongodb-kubernetes-operator/issues/177 claims this is fixed, but whenever I tried any other name, 2 volumes per replica set member were still created. The ROOT_URL of https://usuih.com/chat is just the base route the ingress-nginx controller forwards to (service port 11001); it will be different for anyone else using this example.
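Since the credentials only start working once every member is Ready, a small retry wrapper (the same shape as the initContainer's until-loop in the demo YAML) saves guessing when it is safe to connect. A sketch; the `kubectl wait` label and timeout shown in the comment are assumptions for this setup:

```shell
# Generic retry loop: keep running a command until it succeeds or we give up.
wait_until() {  # usage: wait_until <max_tries> <command...>
  tries=$1; shift
  n=0
  until "$@"; do
    n=$((n+1))
    [ "$n" -lt "$tries" ] || return 1
    sleep 1
  done
}

# With a live cluster, you would wrap a real readiness check instead, e.g.
# (hypothetical label/timeout):
#   kubectl wait pod -l app=rocketchat-mdb-svc --for=condition=Ready --timeout=600s
wait_until 5 true && echo "ready"   # -> ready
```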
apiVersion: mongodb.com/v1
kind: MongoDB
metadata:
  name: rocketchat-mdb
spec:
  members: 3
  type: ReplicaSet
  version: 4.2.9
  persistent: true
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: admin
      db: admin
      passwordSecretRef:
        name: k8s-secrets
        key: ROCKETCHAT_MDB_PASSWORD
      roles:
        - name: root
          db: local
        - name: readWrite
          db: local
        - name: root
          db: test
        - name: readWrite
          db: test
  statefulSet:
    spec:
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: rocketchat-srv-cluster-ip
  labels:
    component: rocketchat-srv-cluster-ip
spec:
  type: ClusterIP
  selector:
    component: rocketchat-srv
  ports:
    - name: rocketchat-srv-http
      port: 11001
      targetPort: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rocketchat-srv
spec:
  replicas: 1
  selector:
    matchLabels:
      component: rocketchat-srv
  template:
    metadata:
      labels:
        component: rocketchat-srv
    spec:
      initContainers:
        - name: rocketchat-mdb
          image: mongo:4.2.9
          env:
            - name: ROCKETCHAT_MDB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: k8s-secrets
                  key: ROCKETCHAT_MDB_PASSWORD
          command: ['sh', '-c', 'until mongo mongodb://admin:$(ROCKETCHAT_MDB_PASSWORD)@rocketchat-mdb-0.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-1.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-2.rocketchat-mdb-svc.default.svc.cluster.local:27017/?authSource=admin\&replicaSet=rocketchat-mdb --eval "print(\"successful connection\")"; do sleep 1; done;']
      containers:
        - name: rocketchat-srv
          image: rocketchat/rocket.chat:3.6.1
          env:
            - name: ROCKETCHAT_MDB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: k8s-secrets
                  key: ROCKETCHAT_MDB_PASSWORD
            - name: ROOT_URL
              value: "https://usuih.com/chat"
            - name: MONGO_URL
              value: "mongodb://admin:$(ROCKETCHAT_MDB_PASSWORD)@rocketchat-mdb-0.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-1.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-2.rocketchat-mdb-svc.default.svc.cluster.local:27017/?authSource=admin&replicaSet=rocketchat-mdb"
            - name: MONGO_OPLOG_URL
              value: "mongodb://admin:$(ROCKETCHAT_MDB_PASSWORD)@rocketchat-mdb-0.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-1.rocketchat-mdb-svc.default.svc.cluster.local:27017,rocketchat-mdb-2.rocketchat-mdb-svc.default.svc.cluster.local:27017/local?authSource=admin&replicaSet=rocketchat-mdb"
          ports:
            - name: http
              containerPort: 3000
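For reference, the long connection strings in the manifest follow a fixed pattern: `<name>-N.<name>-svc.<namespace>.svc.cluster.local:27017` for each member. A small sketch that assembles the URI from the resource name, so it stays consistent if you change the member count (namespace `default` assumed; the `$(ROCKETCHAT_MDB_PASSWORD)` placeholder is left literal, as Kubernetes substitutes it, not the shell):

```shell
# Build the replica-set connection URI from the MongoDB resource name.
name=rocketchat-mdb
ns=default
members=3

hosts=""
i=0
while [ "$i" -lt "$members" ]; do
  # Append host N, comma-separated after the first entry.
  hosts="${hosts:+$hosts,}${name}-${i}.${name}-svc.${ns}.svc.cluster.local:27017"
  i=$((i+1))
done

uri="mongodb://admin:\$(ROCKETCHAT_MDB_PASSWORD)@${hosts}/?authSource=admin&replicaSet=${name}"
echo "$uri"
```

You can then smoke-test connectivity from inside the cluster with a throwaway pod, e.g. `kubectl run --rm -it mongo-client --image=mongo:4.2.9 --restart=Never -- mongo "<uri with the real password>" --eval 'db.runCommand({ping: 1})'`.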
I spent around 3-4 hours and still couldn't run it.
In the Kubernetes world right now, documentation in general is really, really, really awful. If every single issue I had with Kubernetes lasted only 3-4 hours, I would honestly be very happy with Kubernetes. Set your expectations to spend dozens of hours, days, or maybe even weeks fixing a single specific non-trivial issue. Debugging will take you a long time because no one has your custom setup, or things are so new that it is impossible to find help online. Moving from v0.1.1 to v0.2.1 and connecting to it took me a full day. When documentation is lacking, it is very hard to figure out why things are breaking, so you have to systematically iterate through the different combinations of YAML files until things become clearer. For example, I did not know why I could not connect to the first replica set member using the above credentials. After a few hours I took a break and came back, and the credentials worked, because I had waited long enough for the 3 replica set members to come up and for the operator pod to apply them.
The community Operator is about to reach "Beta" state and I would expect it to evolve even more until it gets to a stable state. If you use MongoDB Community, this project should let you run it in Kubernetes. It is still very much a WIP.
The Enterprise Operator targets Enterprise customers and is much further along, with many more features. If you have a MongoDB Enterprise Advanced license, the Enterprise Operator is included, along with Ops Manager, which goes beyond simple installation: MongoDB monitoring, profiling, query analytics, index analysis, LDAP, encryption at rest, and complex configuration and cluster changes, plus Enterprise Backup including point-in-time restore. The Enterprise Operator is also fully supported by the MongoDB Support Team.
@theburi, any idea what the rough pricing for the Enterprise Operator is? I am currently using MongoDB Atlas M30, and I am wondering whether setting it all up myself and paying for the Enterprise Operator would end up at the same or even higher cost...
I wonder why there is no sharding in the community edition. I hope it is on the roadmap.
@deyanp please contact our Sales team for Enterprise pricing - https://www.mongodb.com/pricing
@hi-usui I've just seen your post about backup strategies, very nice and thx for sharing! In this context, are you using the mongo-k8s-operator in production and would you say that it's production-ready (for running replica sets) compared to let's say the Bitnami Helm Mongo chart?
Just my two cents: we tried Bitnami Helm Mongo, but it doesn't look production-ready at all, as it cannot survive "primary" pod restarts (see here: https://github.com/bitnami/charts/issues/6741)
We are looking to see if this Community Operator will work for us better.
Agree with @bmv-ce, I can't remember how many times I fixed the MongoDB deployment of Bitnami repo, it's totally not production ready at all.
i second that
We hope that you're successful in prod - whether that's with this Operator or with Helm Charts.
That said, this Issue tracker is intended to help us track work that the team should take on.
If you'd like to discuss Operators or Charts, I recommend the Community Forum - https://www.mongodb.com/community/forums/tag/kubernetes-operator
I was wondering what the differences are between this operator and the enterprise one. I couldn't find any documentation on the topic: no pricing comparisons, no feature comparisons, no license comparisons... Is this community operator production-ready? Clearly the enterprise one should be...