mongodb / mongodb-atlas-kubernetes

MongoDB Atlas Kubernetes Operator - Manage your MongoDB Atlas clusters from Kubernetes
http://www.mongodb.com/cloud/atlas
Apache License 2.0

Provide a way to make resources not deletable by the operator so I can use it in existing projects - Feature request #265

Closed denist-huma closed 2 years ago

denist-huma commented 3 years ago

Story

As a DevOps
Who is the project owner of sandbox, demo, QA projects
I want Atlas Kubernetes Operator to create a cluster and a user in the existing project
I don't want the Operator to delete an existing project or modify its parameters
I don't want the Operator to delete existing clusters or users
(optional) I want the Operator to fill in the current state of my cluster to the CRD, i.e. accommodate changes made from the command line and UI to the CRD
So I can use the Operator to manage resources in an organization I don't own.

Conclusions of my test drive

The main drawbacks of the current Atlas Kubernetes Operator (for my use case at Huma) are:

- It deletes the project I have declared in an AtlasProject CRD if I delete the last AtlasDatabaseUser linked to this project.
- It deletes the project I have declared in an AtlasProject CRD if I delete the last AtlasCluster linked to this project.
- It deletes the project I have declared in an AtlasProject CRD for reasons unknown to me, e.g. if I modify the project before the first AtlasCluster linked to this project is created.
- It deletes all existing IP addresses that I have entered manually in my project if they are not in the projectIpAccessList of an AtlasProject CRD.

Consideration

Versions

K8s 1.19
AKO 0.5.0

I tag @jasonmimick since it is a feature request, probably several features.

denist-huma commented 3 years ago

Hey @jasonmimick, I am digging into the details of how your operator works. When I turn the project reconciler off with disableAtlasProjectReconciler, AtlasProject.status.id remains unset. This leads to errors in AtlasDatabaseUser reconciliation every time. :disappointed:
"reason": "DatabaseUserNotCreatedInAtlas", "message": "groupID is invalid because must be set"

Update: It is not possible to set AtlasProject.status.id with kubectl, and the same goes for Helm, because of "kubectl edit or apply can not update .status when status subresource is enabled" (#564). I think this shows that storing the groupID in the status field is a bad idea. It would be better to put it in spec and allow me, as a DevOps, to change it.

2021-07-08T12:27:48.969Z        INFO    controllers.AtlasDatabaseUser   -> Starting AtlasDatabaseUser reconciliation    {"atlasdatabaseuser": "operator-sandbox/atlas-cluster-atlas-operator-cluster-admin-user", "spec": {"projectRef":{"name":"atlas-cluster-atlas-operator-project","namespace":""},"databaseName":"admin","roles":[{"roleName":"readWrite","databaseName":"pp_uk_aws_sandbox"}],"scopes":[{"name":"hu-aws-uk-sandbox-mongodb","type":"CLUSTER"}],"passwordSecretRef":{"name":"atlas-cluster-atlas-operator-cluster-admin-user"},"username":"atlas-operator-cluster-admin-user"}, "status": {"conditions":[{"type":"Ready","status":"False","lastTransitionTime":"2021-07-08T11:55:19Z"},{"type":"DatabaseUserReady","status":"False","lastTransitionTime":"2021-07-08T11:55:19Z","reason":"DatabaseUserNotCreatedInAtlas","message":"groupID is invalid because must be set"}],"observedGeneration":1}}
2021-07-08T12:27:48.969Z        INFO    controllers.AtlasDatabaseUser   Reading Atlas API credentials from the AtlasProject Secret operator-sandbox/mongodb-atlas-operator-api-key      {"atlasdatabaseuser": "operator-sandbox/atlas-cluster-atlas-operator-cluster-admin-user"}
2021-07-08T12:27:49.207Z        DEBUG   controllers.AtlasDatabaseUser   HTTP Request (GET) https://cloud.mongodb.com/api/atlas/v1.0/groups//clusters/hu-aws-uk-sandbox-mongodb [time (ms): 237, status: 404]      {"atlasdatabaseuser": "operator-sandbox/atlas-cluster-atlas-operator-cluster-admin-user"}
2021-07-08T12:27:49.211Z        INFO    controllers.AtlasDatabaseUser   Status update   {"atlasdatabaseuser": "operator-sandbox/atlas-cluster-atlas-operator-cluster-admin-user", "lastCondition": {"type":"DatabaseUserReady","status":"False","lastTransitionTime":null,"reason":"DatabaseUserNotCreatedInAtlas","message":"groupID is invalid because must be set"}}
2021-07-08T12:27:49.211Z        DEBUG   controller-runtime.manager.events       Warning {"object": {"kind":"AtlasDatabaseUser","namespace":"operator-sandbox","name":"atlas-cluster-atlas-operator-cluster-admin-user","uid":"5496efdf-9e58-41da-9477-c67172d739e7","apiVersion":"atlas.mongodb.com/v1","resourceVersion":"179442102"}, "reason": "DatabaseUserNotCreatedInAtlas", "message": "groupID is invalid because must be set"}
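
The `groups//clusters/` segment in the GET request above is consistent with an empty project (group) ID being interpolated into the request path. A minimal sketch of that effect, assuming a hypothetical `cluster_url` helper that is not part of the operator (which is written in Go):

```python
# Hypothetical helper, for illustration only.
BASE_URL = "https://cloud.mongodb.com/api/atlas/v1.0"


def cluster_url(group_id: str, cluster_name: str) -> str:
    """Build the cluster endpoint the way plain string formatting would."""
    return f"{BASE_URL}/groups/{group_id}/clusters/{cluster_name}"


# With AtlasProject.status.id unset, the group ID is empty and the path
# collapses to "groups//clusters/...", as in the logged request.
print(cluster_url("", "hu-aws-uk-sandbox-mongodb"))
```
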
denist-huma commented 3 years ago

No worries! :smile: My #271 #272 solution serves me well. I was able to disable only the AtlasClusterReconciler by setting disableAtlasClusterReconciler: "true". The AtlasProjectReconciler cannot do me much harm, since I have active clusters in the project and so it cannot be deleted.

DEBUG   controllers.AtlasDatabaseUser   1 out of 1 clusters have applied database user changes  {"atlasdatabaseuser": "operator-sandbox/atlas-cluster-atlas-operator-cluster-admin-user"}
jasonmimick commented 3 years ago

Hi @denist-huma - Thanks for raising this issue. We are reviewing and will post a more detailed reply next week.

Freyert commented 3 years ago

Was assessing AKO 5.0 for use and came across this issue. Wanted to try things out. The only one I could replicate was the IPAllowlist overwrite by AtlasProject CR. Are the first three still an issue? I wasn't able to reproduce them.

❌ It deletes the project I have declared in an AtlasProject CRD if I delete the last AtlasDatabaseUser linked to this project
❌ It deletes the project I have declared in an AtlasProject CRD if I delete the last AtlasCluster linked to this project
❌ It deletes the project I have declared in an AtlasProject CRD for the reasons unknown to me, i.e. if I modify the project before the first AtlasCluster linked to this project will be created
✅ It deletes all existing IP address that I have entered manually in my project if they are not in the projectIpAccessList of an AtlasProject CRD

denist-huma commented 3 years ago

@Freyert your input is interesting. I wonder how you got different results. :thinking: Are we testing the same version? I have commit ded86271fc9bc24f5ec354e998bd1682077ae012 (tag: v0.5.1).

In short, yes, I am still having these issues. It deletes the project declared in an AtlasProject CRD if it has no users. I saw the place in the code where this happens. :eyes:


Freyert commented 2 years ago

@denist-huma that's good info. I'm having a look again. We're seeing that the projects and only the projects are being sent these delete messages.

Freyert commented 2 years ago

@denist-huma I think we've resolved this issue with strange deletions.

We noticed that the hook annotations on AtlasProject resources (if your Helm chart adds them) were likely causing deletes to be issued for the AtlasProjects.

This comment from @fabritsius sorted it out for us: https://github.com/mongodb/mongodb-atlas-kubernetes/issues/335#issuecomment-987846090

Freyert commented 2 years ago

I do think the soft delete of a resource should actually delete the k8s resource, but not the Atlas resource. Currently it gets stuck and you have to remove the finalizers manually for the delete to complete.

Proposed fix: https://github.com/mongodb/mongodb-atlas-kubernetes/pull/357
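
The behaviour asked for above (remove the Kubernetes object, leave the Atlas resource alone) could be sketched roughly as follows. This is an illustration under assumptions, not the code in the linked pull request: the `Resource` class, `handle_delete`, the finalizer name, and the `delete_in_atlas` callback are all hypothetical; only the `mongodb.com/atlas-resource-policy` annotation comes from this thread.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

KEEP_ANNOTATION = "mongodb.com/atlas-resource-policy"  # named in this thread
FINALIZER = "example.com/atlas-cleanup"  # hypothetical finalizer name


@dataclass
class Resource:
    """Hypothetical stand-in for a custom resource that is being deleted."""
    name: str
    annotations: Dict[str, str] = field(default_factory=dict)
    finalizers: List[str] = field(default_factory=list)


def handle_delete(res: Resource, delete_in_atlas: Callable[[str], None]) -> bool:
    """Process a deletion request; return True if Atlas was asked to delete."""
    keep = res.annotations.get(KEEP_ANNOTATION) == "keep"
    if not keep:
        delete_in_atlas(res.name)
    # Drop the finalizer in both cases so Kubernetes can finish removing the
    # object instead of leaving it stuck in a terminating state.
    if FINALIZER in res.finalizers:
        res.finalizers.remove(FINALIZER)
    return not keep
```

The point of the sketch is that the finalizer is released whether or not the Atlas resource is kept, so no manual finalizer edits are needed.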

denist-huma commented 2 years ago

@Freyert thanks for the updates.

chatton commented 2 years ago

Hi @denist-huma, the annotation you mentioned (mongodb.com/atlas-resource-policy=keep) is intended to be applied by the user to any resources that the operator should preserve. If you add it to the resources you're managing, does this solve your problem?

denist-huma commented 2 years ago

@chatton thank you for coming back to me. I have applied the annotation you mentioned (mongodb.com/atlas-resource-policy=keep) - it was not working before. I have not checked with v0.7.0, maybe it is different now.

It would be good to have a functional test for this behavior. That is what I would add if I were the maintainer.

chatton commented 2 years ago

Hi @denist-huma , it looks like that change went into release v0.6.0 so if you try the latest release things should be working for you.

We have tests for this functionality, but if you've tried a version >= v0.6.0 and it's not working there might be an issue with these tests which we will need to address.

In your initial report I see you were using v0.5.0 so there would be no support for this functionality in that version.

chatton commented 2 years ago

Closing this issue as this is resolved by release v0.6.0.

legal90 commented 2 months ago

@chatton

> Hi @denist-huma the annotation you mentioned (mongodb.com/atlas-resource-policy=keep) is currently intended to be applied to any resources that you want to be preserved by the operator by the user. If you add these to the resources you're managing does this solve your problem?

Unfortunately, that issue is still there; I checked with operator version 2.3.1. If an AtlasProject has the annotation mongodb.com/atlas-reconciliation-policy: "skip", it never receives a status field, which makes it impossible to create an AtlasDatabaseUser that refers to this project. The error is:

{"level":"INFO","time":"2024-07-12T09:57:18.639Z","msg":"Status update","atlasdatabaseuser":"my-namespace/my-dbuser","lastCondition":{"type":"DatabaseUserReady","status":"False","lastTransitionTime":null,"reason":"DatabaseUserNotCreatedInAtlas","message":"groupID is invalid because must be set"}}

josvazg commented 2 months ago

@legal90 can you elaborate on what your goal is?

Some context in case it helps:

The annotation mongodb.com/atlas-reconciliation-policy: "skip" means the Operator will skip reconciling that resource completely. In the case of the project, that means anything else that depends on project reconciliation will not be done either. That includes updating the resource status (the project status in this case), which implies dependent resources will not work, as you observed with Database Users.

If you want to protect an Atlas resource from the Operator removing it, but you still need the Operator to reconcile it, then you can use the annotation: mongodb.com/atlas-resource-policy=keep but mongodb.com/atlas-reconciliation-policy: "skip" needs to be removed.

BTW, from 2.x on the Operator defaults the flag object-deletion-protection to true. This means resources such as Projects, Deployments, Users, etc. will be left in Atlas, unmanaged, after the corresponding Kubernetes definition is removed. This effectively makes mongodb.com/atlas-resource-policy=keep the default behavior.
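
To summarise how the two annotations and the deletion-protection flag interact as described in this comment, here is a small sketch. It is a model of the explanation above, not the operator's code: the `Outcome` type and the `outcome` function are hypothetical, and the assumption that a skipped resource is also never deleted in Atlas is an inference from "skip reconciling that resource completely".

```python
from dataclasses import dataclass
from typing import Dict

RECONCILIATION_POLICY = "mongodb.com/atlas-reconciliation-policy"
RESOURCE_POLICY = "mongodb.com/atlas-resource-policy"


@dataclass(frozen=True)
class Outcome:
    reconciled: bool                # the operator processes the resource
    status_updated: bool            # status (e.g. the project ID) is written
    atlas_deleted_on_removal: bool  # Atlas resource goes when the k8s object goes


def outcome(annotations: Dict[str, str], deletion_protection: bool = True) -> Outcome:
    """Model the described behaviour for one resource (illustrative only)."""
    if annotations.get(RECONCILIATION_POLICY) == "skip":
        # Skipped entirely: no status is written, so dependents such as
        # database users cannot find the project ID.
        return Outcome(False, False, False)
    keep = annotations.get(RESOURCE_POLICY) == "keep"
    return Outcome(True, True, not (keep or deletion_protection))
```

In this model, `keep` only changes the outcome when deletion protection is turned off, which matches the remark that protection makes `keep` the default.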

Not sure this has helped your case. Let me know.

legal90 commented 2 months ago

Hi @josvazg ,

My goal was to use the operator to reconcile only the AtlasDatabaseUser resource, but not the AtlasProject resource. I don't want the operator to manage my Atlas Project, because I want to do that with the Terraform provider instead.

Is it possible to achieve that in any way?

If not - maybe it would make sense to return a better error message? "groupID is invalid because must be set" doesn't really point to the root of the issue 🤔

josvazg commented 2 months ago

As of today there is no way to reconcile only AtlasDatabaseUser resources, sorry.

About a better error message: we could revisit this and check whether the project ID is missing before calling the Atlas API... but the DB user controller still would not really know why it is missing. Maybe it should fail when the project ID is not found in the project reference.
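
One possible shape for such an early check, sketched under assumptions: `require_project_id` and `MissingProjectIDError` are hypothetical names, and the status is modelled as a plain mapping with an `id` key, mirroring the AtlasProject.status.id field mentioned earlier in the thread.

```python
from typing import Optional


class MissingProjectIDError(Exception):
    """Raised when the referenced project exposes no Atlas project ID."""


def require_project_id(project_name: str, status: Optional[dict]) -> str:
    """Return the project ID from a status mapping, or fail with a clear message."""
    project_id = (status or {}).get("id", "")
    if not project_id:
        # Fail before any Atlas API call, naming the likely causes.
        raise MissingProjectIDError(
            f"AtlasProject {project_name!r} has no status.id; it may not have "
            "been reconciled yet, or its reconciliation may be skipped"
        )
    return project_id
```

Failing here, rather than after the API rejects an empty group ID, would let the message name the project reference instead of reporting "groupID is invalid because must be set".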