project-codeflare / multi-cluster-app-dispatcher

Holistic job manager on Kubernetes
Apache License 2.0

[resource accounting] update cache error handling and retry #420

Open asm582 opened 1 year ago

asm582 commented 1 year ago

In https://github.com/project-codeflare/multi-cluster-app-dispatcher/blob/433d1cbadfe7807a22f79f292f8caec897def519/pkg/controller/clusterstate/cache/cache.go#L379, when updating the cache fails, we log the error and sleep for a second. Does that make sense?

asm582 commented 1 year ago

Looking down the chain, saveState() always returns nil, so updateCache() will never return an error?

https://github.com/project-codeflare/multi-cluster-app-dispatcher/blob/433d1cbadfe7807a22f79f292f8caec897def519/pkg/controller/clusterstate/cache/cache.go#L240