Open bboreham opened 3 years ago
I added some more logging to app/controls.go:
app/controls.go
@@ -95,9 +96,10 @@ func handleProbeWS(cr ControlRouter) CtxHandlerFunc {
 			respondWith(ctx, w, http.StatusBadRequest, err)
 			return
 		}
+		log.Infof("Registered probe %s (%d)", probeID, id)
 		defer cr.Deregister(ctx, probeID, id)
-		if err := codec.WaitForReadError(); err != nil && !xfer.IsExpectedWSCloseError(err) {
-			log.Errorf("Error on websocket: %v", err)
+		if err := codec.WaitForReadError(); err != nil /*&& !xfer.IsExpectedWSCloseError(err)*/ {
+			log.Errorf("Error on websocket from probe %s (%d): %v", probeID, id, err)
 		}
 	}
 }
and observed this pattern: different probes disconnect (WebSocket close code 1006, "abnormal closure", means the connection dropped without a close frame), then reconnect twice in quick succession:
<app> INFO: 2021/05/18 10:55:12.128837 app starting, version dacfd66e, ID 250f6a02c6baf2f7
<app> INFO: 2021/05/18 10:55:12.128883 command line args: --app.http.address=:80 --mode=app --weave=false
<app> INFO: 2021/05/18 10:55:12.129327 Basic authentication disabled
<app> INFO: 2021/05/18 10:55:12.129467 listening on :80
<app> INFO: 2021/05/18 10:55:13.107251 Registered probe 7776b1e8bdfca2d2 (2389263251951777280)
<app> INFO: 2021/05/18 10:55:13.110027 Registered probe 6c9c4562da09a115 (8839232576504565880)
<app> INFO: 2021/05/18 10:55:13.111379 Registered probe 301c3075165d128d (4509635939760995389)
<app> INFO: 2021/05/18 10:55:13.111448 Registered probe 63cc85218ecf4fd5 (2096114807635611025)
<app> INFO: 2021/05/18 10:55:13.115702 Registered probe 70a1351b675f3949 (1081430012913752367)
<app> INFO: 2021/05/18 10:55:13.115929 Registered probe 5bb63ea0a20a3666 (963780891770807684)
<app> INFO: 2021/05/18 10:55:13.123519 Registered probe 929c2321520af1f (262033173521075032)
<app> INFO: 2021/05/18 10:55:14.109107 Registered probe 22c8b8a2f66f5574 (8775167029614101706)
<app> INFO: 2021/05/18 10:55:14.111184 Registered probe 5eaa20ad7d7e2cae (6380512523209295092)
<app> ERRO: 2021/05/18 10:55:16.518348 Error on websocket from probe 70a1351b675f3949 (1081430012913752367): websocket: close 1006 (abnormal closure): unexpected EOF
<app> INFO: 2021/05/18 10:55:16.518950 Registered probe 70a1351b675f3949 (6950709212411960207)
<app> INFO: 2021/05/18 10:55:16.518980 Registered probe 70a1351b675f3949 (1044548149924223819)
<app> ERRO: 2021/05/18 10:55:16.519344 Error on websocket from probe 70a1351b675f3949 (6950709212411960207): websocket: close 1006 (abnormal closure): unexpected EOF
<app> ERRO: 2021/05/18 10:55:36.237923 Error on websocket from probe 22c8b8a2f66f5574 (8775167029614101706): websocket: close 1006 (abnormal closure): unexpected EOF
<app> INFO: 2021/05/18 10:55:36.242291 Registered probe 22c8b8a2f66f5574 (2841541895986255109)
<app> INFO: 2021/05/18 10:55:36.436206 Registered probe 22c8b8a2f66f5574 (1836450431459227589)
<app> ERRO: 2021/05/18 10:55:36.436618 Error on websocket from probe 22c8b8a2f66f5574 (1836450431459227589): websocket: close 1006 (abnormal closure): unexpected EOF
<app> ERRO: 2021/05/18 10:55:51.341544 Error on websocket from probe 7776b1e8bdfca2d2 (2389263251951777280): websocket: close 1006 (abnormal closure): unexpected EOF
<app> INFO: 2021/05/18 10:55:51.342401 Registered probe 7776b1e8bdfca2d2 (249717423108416770)
<app> INFO: 2021/05/18 10:55:51.342401 Registered probe 7776b1e8bdfca2d2 (3743738419699173713)
<app> ERRO: 2021/05/18 10:55:51.342765 Error on websocket from probe 7776b1e8bdfca2d2 (3743738419699173713): websocket: close 1006 (abnormal closure): unexpected EOF
<app> INFO: 2021/05/18 10:56:02.506321 Registered probe 6c9c4562da09a115 (7854952504722763749)
<app> INFO: 2021/05/18 10:56:02.507742 Registered probe 6c9c4562da09a115 (2175365361958449149)
<app> ERRO: 2021/05/18 10:56:02.509808 Error on websocket from probe 6c9c4562da09a115 (7854952504722763749): websocket: close 1006 (abnormal closure): unexpected EOF
<app> ERRO: 2021/05/18 10:56:02.510059 Error on websocket from probe 6c9c4562da09a115 (8839232576504565880): websocket: close 1006 (abnormal closure): unexpected EOF
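
To check the double-reconnect pattern mechanically rather than by eye, something like the following could replay the log. This is only a sketch: `Event`, `ParseEvents`, and `MaxLive` are hypothetical helper names, not part of the app, and it assumes the two message formats added in the diff above. It also assumes that an "Error on websocket" line marks the end of that connection, which seems reasonable here because the expected-close filter is commented out.

```go
package main

import (
	"fmt"
	"regexp"
)

// Event is one parsed log entry: a probe connection being registered,
// or a websocket error that ends a connection. (Hypothetical type.)
type Event struct {
	Registered bool   // true for "Registered probe", false for "Error on websocket"
	ProbeID    string // probe ID as printed in the log
	ConnID     string // per-connection id, kept as a string
}

// eventRe matches both message formats from the extra logging.
var eventRe = regexp.MustCompile(
	`(Registered probe|Error on websocket from probe) (\w+) \((\d+)\)`)

// ParseEvents extracts events in log order. It scans the whole text,
// so it does not depend on where the line breaks fall.
func ParseEvents(log string) []Event {
	var events []Event
	for _, m := range eventRe.FindAllStringSubmatch(log, -1) {
		events = append(events, Event{
			Registered: m[1] == "Registered probe",
			ProbeID:    m[2],
			ConnID:     m[3],
		})
	}
	return events
}

// MaxLive replays the events and returns, per probe, the largest number
// of connections that were registered at the same time.
func MaxLive(events []Event) map[string]int {
	live := map[string]map[string]bool{}
	peak := map[string]int{}
	for _, e := range events {
		if live[e.ProbeID] == nil {
			live[e.ProbeID] = map[string]bool{}
		}
		if e.Registered {
			live[e.ProbeID][e.ConnID] = true
		} else {
			delete(live[e.ProbeID], e.ConnID)
		}
		if n := len(live[e.ProbeID]); n > peak[e.ProbeID] {
			peak[e.ProbeID] = n
		}
	}
	return peak
}

func main() {
	// Entries for one probe, copied from the log above.
	sample := `
INFO: 2021/05/18 10:55:13.115702 Registered probe 70a1351b675f3949 (1081430012913752367)
ERRO: 2021/05/18 10:55:16.518348 Error on websocket from probe 70a1351b675f3949 (1081430012913752367): websocket: close 1006 (abnormal closure): unexpected EOF
INFO: 2021/05/18 10:55:16.518950 Registered probe 70a1351b675f3949 (6950709212411960207)
INFO: 2021/05/18 10:55:16.518980 Registered probe 70a1351b675f3949 (1044548149924223819)
ERRO: 2021/05/18 10:55:16.519344 Error on websocket from probe 70a1351b675f3949 (6950709212411960207): websocket: close 1006 (abnormal closure): unexpected EOF
`
	peak := MaxLive(ParseEvents(sample))
	fmt.Println(peak["70a1351b675f3949"]) // prints 2
}
```

A peak above 1 for a probe means two connections from it were registered at once, which is the "reconnect twice" case seen above.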