Open dpadhiar opened 1 week ago
Sorry, @afugazzotto - I know you probably had this working perfectly before. Wondering if you can look.
I'm guessing this probably got broken when I removed the calls to:
removeNullValuesFromMap(onlyPaths)
removeNullValuesFromMap(withoutPaths)
in this PR.
We need to figure out the best way to handle this.
Background: The reason I removed those calls is that we cannot remove all empty maps from a given spec because some of them are meaningful, e.g. if we take this Kafka field (which is a Pointer), then:
sink:
kafka: {}
a: b
is meaningfully different from:
sink:
a: b
and Derek updated numaflow such that it no longer writes zero values.
(more information about the analysis that caused this change here)
Actually, I reassigned it to myself to look at first. I have a feeling it may be related to how we’re testing vs an actual bug - need to check that
Okay, I was able to reproduce this just now by doing what Dillen was originally doing to cause this I believe. I took his PR and updated the original isbsvc Rollout spec to comment out these lines:
isbServiceSpec = numaflowv1.InterStepBufferServiceSpec{
Redis: nil,
JetStream: &numaflowv1.JetStreamBufferService{
Version: "2.9.6",
Persistence: &numaflowv1.PersistenceStrategy{
VolumeSize: &volSize,
},
/*ContainerTemplate: &numaflowv1.ContainerTemplate{
Resources: v1.ResourceRequirements{
Limits: v1.ResourceList{v1.ResourceMemory: volSize},
},
},*/
},
}
The changed version is as it is in the PR:
ISBServiceSpecExcludedField = numaflowv1.InterStepBufferServiceSpec{
Redis: nil,
JetStream: &numaflowv1.JetStreamBufferService{
Version: "2.9.8",
Persistence: &numaflowv1.PersistenceStrategy{
VolumeSize: &volSize,
},
ContainerTemplate: &numaflowv1.ContainerTemplate{
Resources: v1.ResourceRequirements{
Limits: v1.ResourceList{v1.ResourceMemory: memLimit},
},
},
},
}
The excluded field is: "jetstream.containerTemplate.resources.limits"
The excerpt from log is this:
...."newSpecWithoutApplyPaths":{"jetstream":{"containerTemplate":{"resources":{}},"persistence":{"volumeSize":"10Mi"},"version":"2.9.8"}}
....."existingSpecWithoutApplyPaths":{"jetstream":{"persistence":{"volumeSize":"10Mi"},"version":"2.9.6"}}
{"level":"debug","isbservicerollout":{"Namespace":"numaplane-system","Name":"test-isbservice-rollout"},"userUpgradeStrategy":"pause-and-drain","newSpecWithoutApplyPaths":{"jetstream":{"containerTemplate":{"resources":{}},"persistence":{"volumeSize":"10Mi"},"version":"2.9.8"}},"existingSpecWithoutApplyPaths":{"jetstream":{"persistence":{"volumeSize":"10Mi"},"version":"2.9.6"}},"logger":"numaplane.controller-manager.isbservicerollout-reconciler","caller":"usde/usde.go:70","ts":"2024-10-08T21:04:08.714486525Z","msg":"the specs without the 'apply' paths are different"}
So, the issue is that the parent fields of our excluded field are still being included in the comparison.
Here are my thoughts on this: https://docs.google.com/document/d/1nbJ9qjK49JgOsiOxKqS4Foi_ph3JQCmbFi3FZTHDqeQ/edit
Hey @afugazzotto - we don't necessarily need to do this in Q1 but maybe you can take a look at this in early Q2 - I wrote up a document above. Let me know if it makes sense (fine to look at it later).
@juliev0 I'll take a look and get back to you.
@juliev0 after running some tests, I see that the comparison will always result in a difference in case a path is removed or not previously present since now we do not "cleanup" the object by removing null values (objects, arrays, etc.) previously done by removeNullValuesFromMap
func. I added a few comments on your doc and I think option 2 may be the only possible one at least for the time being. I'll try to think of other possible options. We should further discuss and brainstorm.
Describe the bug
If an excludedPath or path to a field does not currently exist in the original spec of a Rollout object, the USDE method
ResourceNeedsUpdating
will not return the strategyDirectApply
like is expected and instead will returnPauseAndDrain
.Note that excluding the parent field of a path does currently work.
To Reproduce Steps to reproduce the behavior:
upgrade decision result
log statement.Expected behavior
We expect the Pipeline to not pause and the resulting upgrade strategy to be DirectApply.
Message from the maintainers:
Impacted by this bug? Give it a 👍. We often sort issues this way to know what to prioritize.