Open bharathkkb opened 2 years ago
/cc @karlkfi if you have any ideas or my yaml is misconfigured
The behavior in Scenario 1 seems expected. Without the depends-on, kpt doesn't know the LoggingLogSink depends on the PubSubTopic, and the LoggingLogSink won't become reconciled until after the PubSubTopic is applied and reconciled. So kpt applies the Project and LoggingLogSink and waits forever for the LoggingLogSink to reconcile, which it won't, because the PubSubTopic hasn't been applied.
You can make it time out with the --reconcile-timeout flag, but by default it waits forever.
The behavior of Scenario 2 seems to imply that Project, PubSubTopic, and LoggingLogSink only wait for their dependencies to exist in KRM and not GCP before being recognized as reconciled. But I'll need to try it to know for sure. Or you can paste the object YAML with their full status at the end of the kpt live apply, using something like kubectl get -f ./ -o yaml.
I'm guessing that kstatus is recognizing KCC dependency errors as not-reconciled, but isn't recognizing whatever the in-progress condition is as not-reconciled. There are a number of "standard" ways kstatus recognizes reconciliation, and KCC may not have implemented them consistently. At least, that's my guess for now. It could also be a bug, but per Occam's razor I'd like to rule out the simpler explanation first.
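To make the guess above concrete, here is a simplified Python sketch of the kind of generic checks a status library could apply to an object. It is not kstatus's actual code: the function name compute_status, the exact rule order, and the sample objects are assumptions made for illustration only.

```python
def compute_status(obj):
    """Classify an object as 'InProgress', 'Failed', or 'Current'.

    Hypothetical, simplified model of generic status checks.
    """
    meta = obj.get("metadata", {})
    status = obj.get("status", {})

    # A controller that has not yet observed the latest spec is still working.
    observed = status.get("observedGeneration")
    if observed is not None and observed != meta.get("generation"):
        return "InProgress"

    for cond in status.get("conditions", []):
        ctype, cstatus = cond.get("type"), cond.get("status")
        if ctype == "Stalled" and cstatus == "True":
            return "Failed"
        if ctype == "Reconciling" and cstatus == "True":
            return "InProgress"
        if ctype == "Ready" and cstatus != "True":
            return "InProgress"

    # Nothing signals pending work, so the object is treated as reconciled.
    return "Current"


# Hypothetical objects: one just applied with no status, one reporting not-ready.
fresh = {"metadata": {"generation": 1}}
not_ready = {
    "metadata": {"generation": 1},
    "status": {"conditions": [{"type": "Ready", "status": "False"}]},
}
print(compute_status(fresh))
print(compute_status(not_ready))
```

Under these assumed rules, an object whose controller has not written any status yet falls through to "Current". If something like that happens with the KCC objects, it would be one possible explanation for Scenario 2 reporting reconciled too early, but that needs to be confirmed against the real status output.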
FWIW, if you remove all the depends-on, it should work fine too. It would be similar to Scenario 2, except all in one apply & reconcile phase.
But if the KCC objects are being detected as reconciled when they're not in GCP yet, we should probably fix that, either in KCC or kstatus.
Thanks for looking into this @karlkfi!
The behavior in Scenario 1 seems expected. Without the depends-on, kpt doesn't know the LoggingLogSink depends on the PubSubTopic, and the LoggingLogSink won't become reconciled until after the PubSubTopic is applied and reconciled. So kpt applies the Project and LoggingLogSink and waits forever for the LoggingLogSink to reconcile, which it won't, because the PubSubTopic hasn't been applied.
Wouldn't it be better to proceed to applying PubSubTopic as soon as Project is resolved as LoggingLogSink is not a dependency?
The behavior of Scenario 2 seems to imply that Project, PubSubTopic, and LoggingLogSink only wait for their dependencies to exist in KRM
Right, and what's odd is that live status does seem to pick up the UpdateFailed status. I will try to grab some logs tomorrow.
FWIW, if you remove all the depends-on, it should work fine too. It would be similar to Scenario 2, except all in one apply & reconcile phase.
Yeah, this is what we currently have, but we're hoping to leverage depends-on to reduce some of the log verbosity for automation when we have a lot of resources.
Wouldn't it be better to proceed to applying PubSubTopic as soon as Project is resolved as LoggingLogSink is not a dependency?
Yes. It’s been on my TODO list to rewrite the task scheduler to use an asynchronous dependency graph, but for now the implementation is a graph flattened into phases. So if a reconcile phase doesn’t succeed, it doesn’t continue.
If you put a reconcile timeout on it, it might actually succeed, but the task scheduler doesn’t know that.
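The "graph flattened into phases" behavior can be illustrated with a small Python sketch. This is not kpt's scheduler: flatten_into_phases and first_stalled_phase are hypothetical helpers, and the model assumes a resource can only reconcile once everything it really needs (including references such as pubSubTopicRef) has been applied in the same or an earlier phase.

```python
def flatten_into_phases(resources, depends_on):
    """Group resources into ordered phases using only explicit depends-on edges."""
    remaining = set(resources)
    done = set()
    phases = []
    while remaining:
        # A resource is ready once all of its explicit dependencies are in earlier phases.
        ready = sorted(r for r in remaining if set(depends_on.get(r, ())) <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        phases.append(ready)
        done.update(ready)
        remaining.difference_update(ready)
    return phases


def first_stalled_phase(phases, real_deps):
    """Return the index of the first phase that can never reconcile, or None."""
    applied = set()
    for index, phase in enumerate(phases):
        applied.update(phase)
        for resource in phase:
            # The scheduler waits for the whole phase, so one blocked resource stalls it.
            if not set(real_deps.get(resource, ())) <= applied:
                return index
    return None


resources = ["Project", "PubSubTopic", "LoggingLogSink"]
# What each resource actually needs, whether or not it is annotated.
real_deps = {"PubSubTopic": ["Project"], "LoggingLogSink": ["PubSubTopic"]}

# Scenario 1: only PubSubTopic carries a depends-on annotation.
scenario_1 = flatten_into_phases(resources, {"PubSubTopic": ["Project"]})
print(scenario_1)                                  # [['LoggingLogSink', 'Project'], ['PubSubTopic']]
print(first_stalled_phase(scenario_1, real_deps))  # 0

# Scenario 2: every real dependency is also annotated.
scenario_2 = flatten_into_phases(resources, real_deps)
print(scenario_2)                                  # [['Project'], ['PubSubTopic'], ['LoggingLogSink']]
print(first_stalled_phase(scenario_2, real_deps))  # None
```

In this model, Scenario 1 stalls in the first phase because LoggingLogSink shares it with Project while the PubSubTopic it references is held back for the next phase. With no depends-on annotations at all, everything lands in a single phase and nothing stalls.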
Expected behavior
live apply proceeds after explicit depends-on target is reconciled
Actual behavior
live apply stalls after explicit depends-on target is reconciled
Information
Scenario 1
PubSubTopic resource has a dependency on Project via a depends-on annotation. LoggingLogSink is dependent on the PubSubTopic resource via pubSubTopicRef in KCC but has no depends-on annotation.
When we do a kpt live apply this is the output
It seems to be stuck after project is reconciled. Running live status in another window.
If I exit and reapply it still seems to be stuck.
My suspicion is that it is waiting for logginglogsink to reconcile since it applied it in the initial apply phase. However, as logginglogsink has a pubSubTopicRef to the pubsub resource, it is unable to make progress. If this is the case, I believe kpt should be looking only at explicit deps when deciding the next apply phase.
Scenario 2
If I add an explicit depends-on to the LoggingLogSink resource to wait for the PubSubTopic resource and start a fresh apply, it seems to complete immediately without waiting for resources to reconcile, although it reports that they had reconciled. Note: this was another brand new project, not a continuation of Scenario 1.
Running a live status right after this shows the project is still reconciling and the others are erroring as the project is not created yet.
Eventually everything reconciles while behaving as if everything were applied at once without depends-on.
Kpt Version: 1.0.0-beta.17
Kpt Package that can demonstrate the error: https://github.com/bharathkkb/kpt-live-depends-issue