microbean / microbean-kubernetes-controller

A toolset for writing Kubernetes controllers, or operators, in Java.
https://microbean.github.io/microbean-kubernetes-controller/
Apache License 2.0
20 stars 7 forks source link

Reflector#start() does not retry infinitely #13

Open ljnelson opened 5 years ago

ljnelson commented 5 years ago

I am not entirely sure but I believe that the relevant Go code ends up retrying forever:

https://github.com/kubernetes/client-go/blob/dcf16a0f3b52098c3d4c1467b6c80c3e88ff65fb/tools/cache/reflector.go#L128-L137

But Reflector#start() will bomb out if it can't list things:

https://github.com/microbean/microbean-kubernetes-controller/blob/ee05d9ece9f591bc7052a2569a6422e804be42e8/src/main/java/org/microbean/kubernetes/controller/Reflector.java#L625

That line will throw a KubernetesClientException, and that's it. I think this whole method should (internally) be trying this list-and-watch loop forever.

For more details, see also the ListAndWatch function:

https://github.com/kubernetes/client-go/blob/dcf16a0f3b52098c3d4c1467b6c80c3e88ff65fb/tools/cache/reflector.go#L165-L275

Another way to put this is that currently Reflector#start() really just models ListAndWatch, but it should be modeling Run in reflector.go as well.

Complicating matters is the fact that the fabric8 client will automatically try to reconnect watches if they fail.

ljnelson commented 5 years ago

I believe this code:

https://github.com/kubernetes/client-go/blob/dcf16a0f3b52098c3d4c1467b6c80c3e88ff65fb/tools/cache/reflector.go#L226-L274

…is actually implemented by the fabric8 client, more or less:

https://github.com/fabric8io/kubernetes-client/blob/cee8550053e671bb0b3489e37196924f4c949a3f/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/internal/WatchConnectionManager.java#L293-L333

…so the problem may simply reduce to the body of the start() method needing to run infinitely, i.e. to ensure that the listing behavior is also retried ad nauseam.

ljnelson commented 5 years ago

Ah, wait; this is already handled:

https://github.com/microbean/microbean-kubernetes-controller/blob/ee05d9ece9f591bc7052a2569a6422e804be42e8/src/main/java/org/microbean/kubernetes/controller/Reflector.java#L1031-L1057

There could be better logging here. I'll add logging and then close this issue.

ljnelson commented 5 years ago

Need to also review the Go code to see if the "full replace" operation occurs in the event of a failure-and-retry occurrence.