fabric8io / kubernetes-client

Java client for Kubernetes & OpenShift
http://fabric8.io
Apache License 2.0
3.41k stars 1.46k forks source link

Stopping sharedInformers takes 5 seconds #3024

Closed akram closed 3 years ago

akram commented 3 years ago

While stopping registered informers, there is always a fixed 5 seconds time per informer.

Here a gist for a reproducer: https://gist.github.com/akram/bbe5eb8793db580b82d8bfc320ecc7e6

Logs does not reveal where is it coming from. Between these 2 lines there is always 5 seconds

2021-04-21 10:38:44:964 INFO io.fabric8.kubernetes.examples.SharedInformerExample - Stopping informers at Wed Apr 21 10:38:44 CEST 2021

and

2021-04-21 10:38:49:973 INFO io.fabric8.kubernetes.examples.SharedInformerExample - Stopped informers at Wed Apr 21 10:38:49 CEST 2021

Please note that an additional 5 seconds await is done and logged after stopping the informer when the serviceExecutor is shutdown.

2021-04-21 10:38:37:265 INFO io.fabric8.kubernetes.examples.SharedInformerExample - jenkins-1-deploy pod added
2021-04-21 10:38:44:964 INFO io.fabric8.kubernetes.examples.SharedInformerExample - Stopping informers at Wed Apr 21 10:38:44 CEST 2021
2021-04-21 10:38:44:966 DEBUG io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager - Force closing the watch io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@714ceaf5
2021-04-21 10:38:44:967 INFO io.fabric8.kubernetes.client.informers.cache.ReflectorWatcher - Watch gracefully closed
2021-04-21 10:38:44:967 DEBUG io.fabric8.kubernetes.client.informers.cache.Reflector - Starting watcher for resource class io.fabric8.kubernetes.api.model.Pod v10903793
2021-04-21 10:38:44:968 DEBUG io.fabric8.kubernetes.client.informers.cache.Reflector - Stopping previous watcher
2021-04-21 10:38:44:968 DEBUG io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager - Force closing the watch io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@714ceaf5
2021-04-21 10:38:44:968 DEBUG io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager - Ignoring duplicate firing of onClose event
2021-04-21 10:38:44:969 DEBUG io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager - Closing websocket okhttp3.internal.ws.RealWebSocket@134ebb2c
2021-04-21 10:38:44:970 DEBUG io.fabric8.kubernetes.client.dsl.internal.AbstractWatchManager - Closing ExecutorService
2021-04-21 10:38:44:971 DEBUG io.fabric8.kubernetes.client.informers.cache.Reflector - Watcher already started, delaying execution of new watcher
2021-04-21 10:38:45:083 DEBUG io.fabric8.kubernetes.client.dsl.internal.WatcherWebSocketListener - Socket closing:
2021-04-21 10:38:45:084 DEBUG io.fabric8.kubernetes.client.dsl.internal.WatcherWebSocketListener - WebSocket close received. code: 1000, reason:
2021-04-21 10:38:45:084 DEBUG io.fabric8.kubernetes.client.dsl.internal.WatcherWebSocketListener - Ignoring onClose for already closed/closing websocket
2021-04-21 10:38:49:973 INFO io.fabric8.kubernetes.examples.SharedInformerExample - Stopped informers at Wed Apr 21 10:38:49 CEST 2021
2021-04-21 10:38:49:973 WARN io.fabric8.kubernetes.client.informers.cache.ProcessorListener - Processor thread interrupted: null
2021-04-21 10:38:49:973 WARN io.fabric8.kubernetes.client.informers.cache.Controller - DefaultController#processLoop got interrupted null
2021-04-21 10:38:49:976 INFO io.fabric8.kubernetes.client.utils.Utils - shuting down executor
2021-04-21 10:38:49:977 INFO io.fabric8.kubernetes.client.utils.Utils - waiting for clean termination 5 sec

Is there a way to make this delay shorter? is it related to okhttp or is it something done in kubernetes-client?

akram commented 3 years ago
image

The onClose method for some reason calls io.fabric8.kubernetes.client.informers.cache.Reflector.startWatcher(). This method has a sleep time of 5 seconds.

this.watcher = new ReflectorWatcher<>(store, lastSyncResourceVersion, this::startWatcher, this::reListAndSync);

There is a problem as, the 3rd argument of the constructor is supposed to be an onClose method ; but here we are passing a startWatcher , that seems to be a confusion of the API.

I am doing some tests to pass another method to see if that's a descent fix.