sevdokimov / log-viewer

Web UI for viewing logs
Apache License 2.0

Shows disconnected after a while #163

Closed (sjtuross closed this 1 year ago)

sjtuross commented 1 year ago

If I don't interact with the tool for a short while, around 1 minute, it shows as disconnected, as below. Any idea how to prevent it?

image

holograph commented 1 year ago

Ha, I was literally on my way to log this issue 😆

Adding a bit of info:

holograph commented 1 year ago

A bit more, quoting RFC 6455 (section 7.4.1):

      1006 is a reserved value and MUST NOT be set as a status code in a
      Close control frame by an endpoint.  It is designated for use in
      applications expecting a status code to indicate that the
      connection was closed abnormally, e.g., without sending or
      receiving a Close control frame.

This tells me either the client or the server simply dropped the TCP connection. The server logs show:

2023-02-20_10:21:29.619 [qtp1494346128-7824] INFO  c.logviewer.web.LogViewerWebsocket - Connection opened: 60 <anonymous>
2023-02-20_10:24:09.973 [qtp1494346128-7745] ERROR c.logviewer.web.LogViewerWebsocket - Connection closed (CloseReason[1006,Disconnected]) 60 <anonymous>

holograph commented 1 year ago

A quick tcpdump didn't help, but there does not appear to be any way to reconnect other than reloading the page. Perhaps (if we can't figure out the cause here) add a reconnect mechanism?
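
For illustration, a reconnect mechanism could look roughly like the sketch below. This is not log-viewer's actual code: `reconnectDelayMs`, `connectWithRetry`, and the backoff values (1 second doubling up to 30 seconds) are hypothetical names and numbers chosen for the example. The socket factory is passed in, so in a browser it could be something like `() => new WebSocket(url)`.

```typescript
// Hypothetical helper: delay before the Nth reconnect attempt (0-based),
// doubling from baseMs and capped at maxMs.
function reconnectDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** Math.max(0, attempt));
}

// Minimal shape needed from a socket; the browser WebSocket satisfies it.
interface ClosableSocket {
  onopen: ((ev: any) => void) | null;
  onclose: ((ev: any) => void) | null;
}

// Opens a socket and re-opens it after every close, waiting longer each time.
// A successful open resets the backoff. `schedule` defaults to setTimeout and
// is injectable so the logic can be exercised without real timers.
function connectWithRetry(
  open: () => ClosableSocket,
  schedule: (fn: () => void, ms: number) => unknown = setTimeout,
): void {
  let attempt = 0;
  const connect = (): void => {
    const ws = open();
    ws.onopen = () => {
      attempt = 0;
    };
    ws.onclose = () => {
      schedule(connect, reconnectDelayMs(attempt++));
    };
  };
  connect();
}
```

A reconnect alone would not stop an idle connection from being dropped; it would only hide the drop from the user by re-establishing the connection automatically.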

ShadowVoyd commented 1 year ago

Just in case you or anyone else is using this through nginx as a reverse proxy and experiencing this issue, you need to specify timeouts in the location block that proxies it.

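# proxy_read_timeout defaults to 60s, which would explain disconnects after about a minute of idle time.
proxy_connect_timeout 1d;
proxy_send_timeout 1d;
proxy_read_timeout 1d;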

P.S. I recommend not exposing this software to the internet through a reverse proxy unless you have it behind some sort of security-hardened authentication (Authelia is a good working example).

sjtuross commented 1 year ago

Thank you for the tips. I indeed have nginx as a reverse proxy. The issue disappears with the suggested proxy timeout options.

holograph commented 1 year ago

This affects virtually anyone behind an L7 (probably L4 as well) proxy, so I think this issue isn't really closed. IMO we can:

  1. Add a simple "ping every second" mechanism via the websocket (I'll try to take a stab at it)
  2. Add recommended settings when behind a load balancer, including a more sensible timeout (say 15 minutes). @sevdokimov what do you think?

ShadowVoyd commented 1 year ago

Websocket only operates at OSI layer 7. If you are using a load balancer setup such as Kubernetes with nginx, have you tried setting the timeouts in the ingress nginx controller? https://loft.sh/blog/kubernetes-nginx-ingress-10-useful-configuration-options/#timeout-settings

holograph commented 1 year ago

@ShadowVoyd While conceptually easy, it's not always a practical possibility. In one case we use Traefik with a common entrypoint for the whole cluster; timeout settings are per entry point, and adding one can incur nontrivial complexity (additional TCP port, upstream NLB configuration etc.)

Furthermore, that configuration is not owned by the team I work with, so they actually can't make that change. In my experience most websocket-based apps end up implementing their own simple heartbeat mechanism - there's supposed to be one built into the websocket spec, but apparently there's no standard JavaScript API so virtually no-one implements it.

I believe a client-side "I'm here" with a server-side "thumbs up" every 10 seconds is a cheap enough solution to what can easily become an organizational black hole, and it does seem to be the norm for these types of implementations. I'd be happy to implement this, but being basically a front-end luddite I'd be happy for a pointer in the right direction @sevdokimov :-D
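
To make the idea concrete, here is a minimal sketch of such a heartbeat, not a proposal for the exact implementation. Everything in it is an assumption for illustration: the `Heartbeat` class, the `replyToHeartbeat` function, and the `"ping"`/`"pong"` message names are hypothetical and do not exist in log-viewer, and the server part is written in TypeScript only to keep the example in one language.

```typescript
// Minimal socket shape; the browser WebSocket satisfies it.
interface HeartbeatSocket {
  send(data: string): void;
  close(): void;
}

// Hypothetical message names; a real protocol would be defined by the application.
const PING = "ping";
const PONG = "pong";

// Client side: sends "I'm here" on every tick and expects a reply before the next one.
class Heartbeat {
  private socket: HeartbeatSocket;
  private awaitingPong = false;

  constructor(socket: HeartbeatSocket) {
    this.socket = socket;
  }

  // Call once per interval (for example every 10 seconds).
  tick(): void {
    if (this.awaitingPong) {
      // The previous ping was never acknowledged: treat the link as dead.
      this.socket.close();
      return;
    }
    this.awaitingPong = true;
    this.socket.send(PING);
  }

  // Feed every incoming message here; returns true if it was a heartbeat reply.
  onMessage(data: string): boolean {
    if (data !== PONG) return false;
    this.awaitingPong = false;
    return true;
  }
}

// Server side: the "thumbs up". Answers a ping with a pong, ignores anything else.
function replyToHeartbeat(data: string): string | null {
  return data === PING ? PONG : null;
}
```

In a browser this could be wired up with `setInterval(() => hb.tick(), 10000)` and by passing each incoming message's data to `hb.onMessage(...)` before the normal message handling. Because traffic flows in both directions on every interval, an idle timeout on a proxy in between should not fire as long as the interval is shorter than that timeout.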