Closed sunchill06 closed 2 years ago
@fabritsius, I would like to draw your attention to this issue, which seems to be a serious problem. We recently observed one more problem:
A cluster was autoscaled from M10 to M20. The Atlas Operator tried to reconcile this and failed with a similar error:
{"level":"INFO","time":"2022-07-19T09:03:30.344Z","msg":"-> Starting AtlasDeployment reconciliation","atlasdeployment":"platform/projects-projects-rs6","spec":{"projectRef":{"name":"projects-xxxxxxx","namespace":""},"advancedDeploymentSpec":{"backupEnabled":true,"diskSizeGB":10,"mongoDBMajorVersion":"4.4","name":"projects-rs6","pitEnabled":false,"replicationSpecs":[{"numShards":1,"regionConfigs":[{"electableSpecs":{"instanceSize":"M10"},"autoScaling":{"diskGBEnabled":true,"compute":{"enabled":true,"scaleDownEnabled":true,"minInstanceSize":"M10","maxInstanceSize":"M40"}},"providerName":"GCP","regionName":"WESTERN_EUROPE"}]}]},"backupRef":{"name":"","namespace":""}},"status":{"conditions":[{"type":"Ready","status":"False","lastTransitionTime":"2022-07-19T08:27:25Z"},{"type":"ValidationSucceeded","status":"True","lastTransitionTime":"2022-07-14T16:26:27Z"},{"type":"DeploymentReady","status":"False","lastTransitionTime":"2022-07-19T08:27:25Z","reason":"DeploymentNotUpdatedInAtlas","message":"PATCH https://cloud.mongodb.com/api/atlas/v1.5/groups/xxxxxxxxx/clusters/projects-rs6: 400 (request \"ATTRIBUTE_READ_ONLY\") The attribute createDate is read-only and cannot be changed by the user."}],"observedGeneration":1,"stateName":"IDLE","mongoDBVersion":"4.4.15","connectionStrings":{"standard":"mongodb://projects-rs6-xxxxxxx.mongodb.net:27017,projects-rs6-xxxxxxx:27017,projects-rs6-xxxxxxx:27017/?ssl=true&authSource=admin&replicaSet=atlas-vavo0p-shard-0","standardSrv":"mongodb+srv://projects-rs6.xxxxxxx"}}}
{"level":"INFO","time":"2022-07-19T09:03:30.344Z","msg":"Reading Atlas API credentials from the AtlasProject Secret platform/projects-xxxxxx-api-key","atlasdeployment":"platform/projects-projects-rs6"}
{"level":"INFO","time":"2022-07-19T09:03:30.842Z","msg":"Status update","atlasdeployment":"platform/projects-projects-rs6","lastCondition":{"type":"DeploymentReady","status":"False","lastTransitionTime":null,"reason":"DeploymentNotUpdatedInAtlas","message":"PATCH https://cloud.mongodb.com/api/atlas/v1.5/groups/xxxxxxx/clusters/projects-rs6: 400 (request \"ATTRIBUTE_READ_ONLY\") The attribute createDate is read-only and cannot be changed by the user."}}
Concerns:
createDate as part of the Deployment Spec? If yes, then what about when we are creating a new cluster?
Any efforts to prioritise this would be highly appreciated, as this seems like a blocker for us. I would be happy to share any further details if required.
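To illustrate the createDate concern, here is a rough sketch of the kind of filtering a client could apply before sending a PATCH body. It is not the operator's actual code. The helper name `strip_read_only` and the `READ_ONLY_ATTRIBUTES` set are hypothetical, and `createDate` is the only attribute the error message above confirms as read-only.

```python
# Hypothetical list: only createDate is confirmed as read-only by the
# ATTRIBUTE_READ_ONLY error in the logs above.
READ_ONLY_ATTRIBUTES = frozenset({"createDate"})


def strip_read_only(cluster, read_only=READ_ONLY_ATTRIBUTES):
    """Return a copy of a cluster description without read-only top-level
    attributes, so it can be used as a PATCH body. The input is not modified."""
    return {key: value for key, value in cluster.items() if key not in read_only}


# A cluster description as it might come back from a GET (placeholder values).
fetched = {"name": "projects-rs6", "createDate": "<timestamp>", "diskSizeGB": 10}
patch_body = strip_read_only(fetched)
```

With the placeholder input above, `patch_body` keeps `name` and `diskSizeGB` and drops `createDate`.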
Hey @sunchill06,
Thanks for reporting this. Looks like a bug; will fix it ASAP.
I've just merged PR #615, which should solve this issue. We are planning a release this week. Feel free to reopen this if the issue persists. Thanks!
Thanks @fabritsius. This would be really appreciated as we are unable to progress the rollout because of this issue.
Hey @fabritsius, thanks a lot for your work on this issue!
One thing about the first question that @sunchill06 asked before:
Should an auto-scaling event be reconciled by the operator? As there is nothing changed in the Deployment Spec from our side.
Please correct me if I'm wrong, but I think we still need some logic to unset disk size and instance size on PATCH requests to a cluster with compute/disk autoscaling.
As far as I understand, PR https://github.com/mongodb/mongodb-atlas-kubernetes/pull/615 only fixes the createDate problem.
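To make the suggestion concrete, here is a minimal sketch of what "unsetting" the autoscaled fields could look like, written against the spec shape shown in the logs earlier in the thread (`diskSizeGB`, `replicationSpecs[].regionConfigs[].electableSpecs.instanceSize`, `autoScaling.compute.enabled`, `autoScaling.diskGBEnabled`). The function name `drop_autoscaled_fields` is hypothetical, and this is only an assumption about the desired behaviour, not what the operator or PR #615 does.

```python
import copy


def drop_autoscaled_fields(spec):
    """Return a copy of an advancedDeploymentSpec-shaped dict in which fields
    owned by autoscaling are removed, so a PATCH would not override them."""
    patch = copy.deepcopy(spec)
    disk_autoscaled = False

    for replication in patch.get("replicationSpecs", []):
        for region in replication.get("regionConfigs", []):
            autoscaling = region.get("autoScaling") or {}

            # Compute autoscaling owns the instance size for this region.
            if (autoscaling.get("compute") or {}).get("enabled"):
                (region.get("electableSpecs") or {}).pop("instanceSize", None)

            if autoscaling.get("diskGBEnabled"):
                disk_autoscaled = True

    # diskSizeGB is a single top-level value; drop it if any region
    # has disk autoscaling enabled.
    if disk_autoscaled:
        patch.pop("diskSizeGB", None)
    return patch


# Trimmed-down spec taken from the reconciliation log above.
spec = {
    "name": "projects-rs6",
    "diskSizeGB": 10,
    "replicationSpecs": [{
        "numShards": 1,
        "regionConfigs": [{
            "electableSpecs": {"instanceSize": "M10"},
            "autoScaling": {
                "diskGBEnabled": True,
                "compute": {"enabled": True, "scaleDownEnabled": True,
                            "minInstanceSize": "M10", "maxInstanceSize": "M40"},
            },
            "providerName": "GCP",
            "regionName": "WESTERN_EUROPE",
        }],
    }],
}
patch = drop_autoscaled_fields(spec)
```

The input is deep-copied so the desired spec stays intact for comparison. Whether the Atlas API accepts a region config without an instance size is something I have not verified.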
Hey @fabritsius, when can we expect the release containing this fix, please?
What did you do to encounter the bug?
We had existing Atlas clusters created and managed by the Alpha version of the Atlas Operator. While upgrading to the operator's GA version, we changed the k8s manifest w.r.t. advancedDeploymentSpec of the AtlasDeployment CR spec. The new spec had the instance size changed from M10 to M40 (which was within the autoscaling limits of the existing Atlas cluster).
Previous Spec:
New Spec:
This resulted in a confusing error:
We got to know about this when we ran the operator in debug mode and saw this in the logs:
What did you expect?
AtlasDeployment should have been created successfully.
Operator Information
kubectl describe output