microbean / microbean-kubernetes-controller

A toolset for writing Kubernetes controllers, or operators, in Java.
https://microbean.github.io/microbean-kubernetes-controller/
Apache License 2.0
20 stars 7 forks source link

Spurious WebSocket.close() #20

Closed xguerin closed 1 year ago

xguerin commented 3 years ago

After prolonged executions with 0.3.0 (API version 4.11), I run into the following event pattern:

2021-02-19 23:58:59 DEBUG [tchConnectionManager:283 ] WebSocket close received. code: 1000, reason:
2021-02-19 23:58:59 DEBUG [tchConnectionManager:299 ] Submitting reconnect task to the executor
2021-02-19 23:58:59 DEBUG [tchConnectionManager:315 ] Scheduling reconnect task
2021-02-19 23:58:59 DEBUG [tchConnectionManager:387 ] Current reconnect backoff is 1000 milliseconds (T0)
2021-02-19 23:59:00 DEBUG [tchConnectionManager:111 ] Connecting websocket ... io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@3750dc19
2021-02-19 23:59:00 DEBUG [tchConnectionManager:162 ] WebSocket successfully opened
2021-02-19 23:59:00 DEBUG [tchConnectionManager:111 ] Connecting websocket ... io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@efc361ee
2021-02-19 23:59:00 DEBUG [tchConnectionManager:162 ] WebSocket successfully opened
2021-02-19 23:59:00 DEBUG [tchConnectionManager:345 ] Force closing the watch io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@3750dc19
2021-02-19 23:59:00 DEBUG [tchConnectionManager:363 ] Ignoring duplicate firing of onClose event

As a result of this, it looks like the events get resynchronized (or forwarded twice) to the listening application. This issue might be related: https://github.com/fabric8io/kubernetes-client/issues/1602. This is how it manifests itself: https://github.com/IBMStreams/OSStreams/issues/11.

I'll investigate to find the how, what and why, but I thought it would be worth to mention the issue here in case anything seems obvious.

ljnelson commented 3 years ago

For my own notes: https://github.com/fabric8io/kubernetes-client/issues/1602#issuecomment-536971603

xguerin commented 3 years ago

Some context below. No idea why we get another response after ~20 minutes or so.

Request:

Request{method=GET, url=https://kubernetes.docker.internal:6443/apis/streams.ibm.com/v1/namespaces/default/streamsprs?resourceVersion=1225180&watch=true, tags={}}

Request headers:

Origin: https://kubernetes.docker.internal:6443
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: rbVgB6L78UPmkVxA+82Ppg==
Sec-WebSocket-Version: 13

Response:

Response{protocol=http/1.1, code=101, message=Switching Protocols, url=https://kubernetes.docker.internal:6443/apis/streams.ibm.com/v1/namespaces/default/streamsprs?resourceVersion=1225180&watch=true}

Response headers:

Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: E3+/F9m256LMtWuBqeHZOmxPV/o=

Websocket opcode: 8, OPCODE_CONTROL_CLOSE