Closed yzdann closed 2 years ago
Related old issue for regularDeployment: https://github.com/mongodb/mongodb-atlas-kubernetes/issues/317
Thank you @yzdann for pointing this out 🙏
Personally I'm not a fan of such filtering in the operator as it removes the option to set instanceSize
or diskGB
when you actually want to change them. It would be better if the Atlas API server did such logic. Keep in mind that for updates the operator sends PATCH
requests so you can always omit the fields you do not want to be changed.
This being said in my view your request makes sense in lots of scenarios. Now I'm thinking of adding such filtering under a special annotation and maybe even make filtering the default. In that case same logic will be applied to the regular deployments to make everything unified.
Have to discuss with the team our options and will get back to you shortly.
Hey @yzdann
@fabritsius and I have discussed this at length.
We're going to make a change to make the following true:
instanceSize
and diskGB
will be respected and usedinstanceSize
and diskGB
will be disregarded by our operator and not sent to Atlas.We're going to hold off on any kind of special annotation (to override the autoscaling) for now. Any scenario where you want to forcibly change the instanceSize
and diskGB
are likely as a result of Atlas not autoscaling as you wanted, or as quickly as you'd like. In either case, any product changes we make should be to autoscaling code in Atlas rather than the Operator - forcing a change is a short-term fix and non-ideal workaround.
A change could still be forced by disabling autoscaling - in which case instanceSize
and diskGB
would be honoured.
We're planning to get this change into a release some time next week.
@yzdann apologies on the delay on this. Some stuff got juggled around in the team and this is just landing now 🙂
We're expecting to do a release with this included this week.
Hi @dan-mckean @fabritsius @igor-karpukhin,
I was testing the latest Operator release in our playground env and have a doubt that I would like to clear:
As per this PR and the comments mentioned in it, it appears that If autoscaling is enabled for subsequent spec changes, instanceSize and diskGB will be disregarded by our operator and not sent to Atlas.
I created an AtlasDeployment
using advancedDeploymentSpec
as below
apiVersion: atlas.mongodb.com/v1
kind: AtlasDeployment
metadata:
name: advanced-cluster
spec:
projectRef:
name: playground-project
advancedDeploymentSpec:
backupEnabled: true
clusterType: SHARDED
diskSizeGB: 40
mongoDBMajorVersion: "4.4"
name: advanced-cluster
pitEnabled: false
replicationSpecs:
- numShards: 1
regionConfigs:
- autoScaling:
compute:
enabled: true
maxInstanceSize: M40
minInstanceSize: M30
scaleDownEnabled: true
diskGB:
enabled: true
electableSpecs:
instanceSize: M30
nodeCount: 3
priority: 7
providerName: GCP
regionName: WESTERN_EUROPE
Then I edited the AtlasDeployment
and updated the spec
to change the instanceSize
to M40
and I got this error (BASE_INSTANCE_SIZE_MUST_MATCH
) in operator logs
{"level":"INFO","time":"2022-11-18T14:11:34.950Z","msg":"Status update","atlasdeployment":"platform/advanced-cluster","lastCondition":{"type":"DeploymentReady","status":"False","lastTransitionTime":null,"reason":"DeploymentNotUpdatedInAtlas","message":"PATCH https://cloud.mongodb.com/api/atlas/v1.5/groups/<redacted-group-id>/clusters/advanced-cluster: 400 (request \"BASE_INSTANCE_SIZE_MUST_MATCH\") Electable and read-only nodes must all have the same instance size."}}
I also expected this to fail but not with this error message, I expected that this patch request would not be sent to Atlas (as per this comment)
But when I edited the AtlasDeployment
and updated the spec
to include readOnlySpecs
as well, the patch request was sent to Atlas and was successful as well and the Deployment's instanceSize got changed at Atlas.
Spec with readOnlySpecs
spec:
advancedDeploymentSpec:
backupEnabled: true
clusterType: SHARDED
diskSizeGB: 40
mongoDBMajorVersion: "4.4"
name: advanced-cluster
pitEnabled: false
replicationSpecs:
- numShards: 1
regionConfigs:
- autoScaling:
compute:
enabled: true
maxInstanceSize: M40
minInstanceSize: M30
scaleDownEnabled: true
diskGB:
enabled: true
electableSpecs:
instanceSize: M40
nodeCount: 3
priority: 7
providerName: GCP
readOnlySpecs:
instanceSize: M40
nodeCount: 3
regionName: WESTERN_EUROPE
Operator logs
{"level":"INFO","time":"2022-11-18T14:18:48.858Z","msg":"Status update","atlasdeployment":"platform/advanced-cluster","lastCondition":{"type":"DeploymentReady","status":"False","lastTransitionTime":null,"reason":"DeploymentUpdating","message":"deployment is updating"}}
❓ 😕 Now this PR and the other PR i.e. https://github.com/mongodb/mongodb-atlas-kubernetes/issues/649 are kind of contradictory where one seems to allow changing instanceSize
and other doesn't in case AutoScaling is enabled.
Can you please clarify what is happening here and if my understanding is not correct?
Hi, @sunchill06. This PR has been merged 23 days ago: https://github.com/mongodb/mongodb-atlas-kubernetes/pull/715 and it allows changing instanceSize
within the minInstanceSize
- maxInstanceSize
range. This change follows Atlas logic, which allows such behavior.
Thanks for the update @igor-karpukhin Few follow-up questions if you don't mind answering:
readOnlySpecs
if they were not part of the original spec during creation of the deployment.instanceSize
can be changed via operator for advancedDeploymentSpec as well?Sorry about so many queries but its really important for us to clarify this and understand what behaviour to expect from the Operator now.
Hi @sunchill06, sorry for the late reply. Let me answer your questions:
readOnlySpecs
if you don't need them. This field is still optional. But, I noticed if I try to update and instanceSize
within the min/max range, without specifying readOnlySpecs
, it will fail. This happens, because Atlas API also returns readOnlySpecs
(you can try to create a cluster only specifying electableSpecs
, and use the Atlas API to get the cluster, you will see all types of specs in the output, even if they have 0 nodes). But, I guess, we can handle that better from the operator side.instanceSize
can be changed by users, not the operator. If you change it within the min/max range - it is allowed by the Atlas.Thanks for your revert @igor-karpukhin
So I believe you would fix the first issue (need to specify readOnlySpecs
) with https://github.com/mongodb/mongodb-atlas-kubernetes/issues/777 ??
Also, when you say instanceSize
can be changed by users? Do you mean operator won't reconcile the instanceSize
due to any autoscaling event at Atlas, by when the AtlasDeployment
CR spec is manually updated, the operator will reconcile that (considering its within the autoscaling min/max size range) ??
Hi @sunchill06. Sorry for the late response, it looks like I missed the notification.
The problem with readOnlySpecs
(and any specs
) is resolved by this PR: https://github.com/mongodb/mongodb-atlas-kubernetes/pull/782.
Regarding the instance size. If you set it and enable autoscaling, it wont' be changed if CR is updated, but instanceSize
has not been touched (https://github.com/mongodb/mongodb-atlas-kubernetes/blob/25d543bf7ef58a257f0d5e5902c8a681c7d7f73c/pkg/controller/atlasdeployment/advanced_deployment.go#L154)
What did you do to encounter the bug? Steps to reproduce the behavior:
What did you expect? With version 1.2.0 (and older) of the operator with
regularDeployment
you cannot change instanceSize/diskGB, if you enabled autoscaling for compute/disk which is a way to prevent overwrite changes of Atlas autoscaler but inadvancedDeployment
operator will allow you to change the instanceSize.Regular Deployment logic https://github.com/mongodb/mongodb-atlas-kubernetes/blob/73c24914089f04bf7ab932f31ba2ed2b439280c4/pkg/controller/atlasdeployment/deployment.go#L118
https://github.com/mongodb/mongodb-atlas-kubernetes/blob/73c24914089f04bf7ab932f31ba2ed2b439280c4/pkg/controller/atlasdeployment/deployment.go#L123-L151
What happened instead? For example, we have an
advancedDeployment
with instanceSizeM30
, Atlas autoscaler because of some load increased the instanceSize to M40. After that we decide to change sth, for example, disable JS and we trigger reconciliation now we are not only changing JS but also changing back the instanceSize to M30 !Screenshots If applicable, add screenshots to help explain your problem.
Operator Information