louislam / uptime-kuma

A fancy self-hosted monitoring tool
https://uptime.kuma.pet
MIT License

Page error message "Lost connection to the socket server. Reconnecting..." #4409

Closed macaty closed 7 months ago

macaty commented 7 months ago

Description

When we click through or jump to manage-status-page (http://x.x.x.x:6001/manage-status-page) from a status page (http://x.x.x.x:6001/status/monitor), we get the error message "Lost connection to the socket server. Reconnecting...".

The detailed error is as follows; please help. Error URL: http://x.x.x.x:6001/socket.io/?EIO=4&transport=polling&t=OqqPZ0i&sid=bdBgD95f9Dm-G10IAAGH. The response has status 404 and body: {"code":1,"message":"Session ID unknown"}
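For readers who want to pick the failing request apart, here is a minimal sketch that splits a Socket.IO polling URL like the one above into its query parameters. The helper name `describe_polling_request` is made up for illustration and is not part of Uptime Kuma or Socket.IO; the only assumption is that a request without a `sid` parameter is the initial handshake, while one with a `sid` refers to an existing session.

```python
from urllib.parse import urlsplit, parse_qs


def describe_polling_request(url: str) -> dict:
    """Summarise the query parameters of a Socket.IO polling request URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # parse_qs returns a list per key; each key appears once here.
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "path": parts.path,
        "engine_io_version": query.get("EIO"),
        "transport": query.get("transport"),
        "sid": query.get("sid"),
        # Assumption: no sid means this is the first (handshake) request.
        "is_handshake": "sid" not in query,
    }


info = describe_polling_request(
    "http://x.x.x.x:6001/socket.io/"
    "?EIO=4&transport=polling&t=OqqPZ0i&sid=bdBgD95f9Dm-G10IAAGH"
)
print(info)
```

In the reported request the `sid` is present, so the server is being asked about a session it no longer (or never) knew about, which matches the "Session ID unknown" body.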

👟 Reproduction steps

refer to Description

👀 Expected behavior

refer to Description

😓 Actual Behavior

refer to Description

🐻 Uptime-Kuma Version

louislam/uptime-kuma:1.23.11

💻 Operating System and Arch

centos 7.9

🌐 Browser

chrome 120.0.6099.225

🐋 Docker Version

No response

🟩 NodeJS Version

No response

📝 Relevant log output

No response

CommanderStorm commented 7 months ago
macaty commented 7 months ago
  1. Deployed with Docker, using this docker-compose.yaml:
    version: '3'
    services:
     uptimekuma:
       container_name: uptimekuma
       image: 'louislam/uptime-kuma:1.23.11'
       volumes:
         - uptime-kuma:/app/data
       ports:
         - '6001:3001'
       restart: always
    volumes:
     uptime-kuma:
  2. It uses host port 6001 instead of 3001; see the deployment info above.
  3. Yes, everything else is normal. It happens only occasionally, and only when switching from a status page to manage-status-page.
CommanderStorm commented 7 months ago

Given that the error message is stating that the Session ID [is] unknown, have you tried logging out + back in? What is the Relevant log output?

macaty commented 7 months ago

Yes, I tried several times and got the same result.

CommanderStorm commented 7 months ago

just a guess (comparing #4422 and my working setup): What is your docker version? Comparing your setup with his, are there any points which seem similar?

baalchina commented 7 months ago

just a guess (comparing #4422 and my working setup): What is your docker version? Comparing your setup with his, are there any points which seem similar?

Hi, I had the same error in https://github.com/louislam/uptime-kuma/issues/4422. My Docker is version 20.10.13, build a224086. Thank you.

CommanderStorm commented 7 months ago

Okay, so it is not the runtime, nor the docker version. The only thing that differs is that both @baalchina and @macaty are running a version of centos.

I have never worked with said OS. Are there any firewalls on said OS that might be active?

baalchina commented 7 months ago

Okay, so it is not the runtime, nor the docker version. The only thing that differs is that both @baalchina and @macaty are running a version of centos.

I have never worked with said OS. Are there any firewalls on said OS that might be active?

Well, I disabled firewalld on my Rocky Linux machine and tried opening Uptime Kuma on a single PC in three browsers (Firefox, Edge, Chrome). Only Firefox displayed "Lost connection to the socket server", and it soon returned to normal. I'll try again later. So, should I open any ports other than 3001? And is there an OS recommended for Kuma, maybe Ubuntu or Debian? Thanks.

louislam commented 7 months ago

Can you reproduce the issue on our demo site?

https://demo.kuma.pet/start-demo

baalchina commented 7 months ago

Can you reproduce the issue on our demo site?

https://demo.kuma.pet/start-demo

Er... I checked the demo site and it seems to show the same error, a 400, but the error soon disappeared and everything returned to normal.

louislam commented 7 months ago

Weird. I cannot reproduce.

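2 possible directions:

  • Browser (Affected by extension?), you should try Firefox or MS Edge without extensions.
  • Something weird in your network, you can try different network like mobile network.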

CommanderStorm commented 7 months ago

I can reproduce this (after the 2nd try => not really repeatable)

CommanderStorm commented 7 months ago

Let's find commonalities. I am running Chrome 121.0.6167.85 => maybe connected with newer chrome versions?

@louislam what browser are you running? Could you retry this 2x more just to be sure that this is not also happening on your end?

louislam commented 7 months ago

I tried several combinations, several times each. I am also on Chrome 121.0.6167.85, so maybe I missed some important step to reproduce.

Test 1:

  1. Refresh the status page, or type the direct status page URL /status/monitor
  2. Click "Go to Dashboard"

Test 2 (From vue router):

  1. Go to /manage-status-page
  2. Click the status page
  3. Click "Go to Dashboard"

CommanderStorm commented 7 months ago

I have tried multiple times again and cannot reproduce what I could reproduce before. It seems rarer than I thought, and I have no reliable way of reproducing it.

Going back to basics (i.e. the docs): https://socket.io/docs/v3/troubleshooting-connection-issues/#in-the-network-monitor-of-your-browser

The session ID (included in the sid query parameter) is unknown from the server. That may happen in a multi-server setup.

@macaty Might there be a second server somewhere, something caching responses, or something tampering with the traffic we send out? From the information provided and from what I could gather, I really cannot debug this further. Do you have any clue why you can reproduce this so regularly and I cannot?
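To make the multi-server case from the docs concrete, here is a toy model, not Socket.IO code: two in-memory "servers" that each track their own session IDs, behind a round-robin balancer without sticky sessions. The class `ToyServer` and its methods are invented for this sketch, and the reply body simply mirrors the one reported in this issue.

```python
import itertools
import uuid


class ToyServer:
    """Toy stand-in for one node that keeps its sessions in local memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: set[str] = set()

    def handshake(self) -> str:
        # A new session ID is known only to the node that issued it.
        sid = uuid.uuid4().hex
        self.sessions.add(sid)
        return sid

    def poll(self, sid: str) -> dict:
        if sid not in self.sessions:
            return {"code": 1, "message": "Session ID unknown"}
        return {"ok": True}


node_a, node_b = ToyServer("a"), ToyServer("b")
balancer = itertools.cycle([node_a, node_b])  # round robin, no stickiness

sid = next(balancer).handshake()  # handshake lands on node a
reply = next(balancer).poll(sid)  # follow-up request lands on node b
sticky_reply = node_a.poll(sid)   # same request sent back to node a

print(reply)
print(sticky_reply)
```

The follow-up request fails only because it reached a node that never issued the session. This illustrates the documented multi-server scenario; it does not claim that this is what happened in the setups reported here.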

louislam commented 7 months ago

https://github.com/socketio/socket.io/issues/4881

I have not looked into it yet, but it seems to be an upstream bug: it was also reported recently, and the OP does not seem to be using multiple nodes.

If that is true, maybe we should roll back to the previous working version.

macaty commented 7 months ago

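Okay, so it is not the runtime, nor the docker version. The only thing that differs is that both @baalchina and @macaty are running a version of centos.

I have never worked with said OS. Are there any firewalls on said OS that might be active?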

    docker ps|grep uptime
    8f6ef13e16fc louislam/uptime-kuma:1.23.11 "/usr/bin/dumb-init …" 6 days ago Up 6 days (healthy) 0.0.0.0:6001->3001/tcp, :::6001->3001/tcp uptimekuma

    docker version
    Client:
     Version: 20.10.17
     API version: 1.41
     Go version: go1.19.5
     Git commit: 100c701
     Built: Sat Feb 11 17:11:34 2023
     OS/Arch: linux/arm64
     Context: default
     Experimental: true

    Server:
     Engine:
      Version: 20.10.17
      API version: 1.41 (minimum version 1.12)
      Go version: go1.19.5
      Git commit: a89b842
      Built: Sat Feb 11 13:33:41 2023
      OS/Arch: linux/arm64
      Experimental: false
     containerd:
      Version: 1.6.6
      GitCommit: 10c12954828e7c7c9b6e0ea9b0c02b01407d3ae1
     runc:
      Version: 1.1.2
      GitCommit: a916309fff0f838eb94e928713dbc3c0d0ac7aa4
     docker-init:
      Version: 0.19.0
      GitCommit: de40ad0

    root@N1:~# docker images | grep uptime
    louislam/uptime-kuma 1.23.11 d84df151f227 4 weeks ago 422MB

macaty commented 7 months ago

Weird. I cannot reproduce.

2 possible directions:

  • Browser (Affected by extension?), you should try Firefox or MS Edge without extensions.
  • Something weird in your network, you can try different network like mobile network.

You are right. It became normal when I disabled the NeatDownloadManager extension.

baalchina commented 7 months ago

Weird. I cannot reproduce. 2 possible directions:

  • Browser (Affected by extension?), you should try Firefox or MS Edge without extensions.
  • Something weird in your network, you can try different network like mobile network.

You are right. It became normal when I disabled the NeatDownloadManager extension.

That's it! I am using NeatDownloadManager too. When I open Chrome in an Incognito window, or Edge in private mode, the socket error disappears on both my site and the demo site.

hrshv6 commented 5 months ago

In our case, the cause of this error was haproxy. To solve this problem, we needed to add the following two settings to our haproxy.cfg file:

[screenshot 2024-04-02_192016: haproxy.cfg settings]

Just wanted to comment in here for other people looking for a solution.

CommanderStorm commented 5 months ago

Likely completely unrelated to the other posts. NONE of them have stated that they are using haproxy.

Note: the docs already state that timeout client-fin is necessary if timeout tunnel is used ^^

Our knowledge of these niche reverse proxies is limited, and that part of the docs is likely not great => PRs from people who know more would be appreciated.

kingfisher77 commented 5 months ago

Suddenly the same error occurs in the GUI: Connection to socket server lost. Restoring the connection...

The setup is done with docker compose. It has been running successfully for many months. Around 50 monitors.

Docker log:

uptime-kuma | 2024-04-12T09:15:07+02:00 [AUTH] INFO: Login via token. IP=172.31.0.2
uptime-kuma | 2024-04-12T09:15:07+02:00 [AUTH] INFO: Username from JWT: admin
uptime-kuma | 2024-04-12T09:15:07+02:00 [AUTH] INFO: Successfully logged in user admin. IP=172.31.0.2

And it goes on like this.

We have reduced the monitor history to 1 day and emptied the database (it was about 1 GB in size) without anything changing. All browsers show the same behavior.

We are now somewhat at a loss. Does anyone have any ideas?

Kerryliu commented 5 months ago

Oddly, I'm also getting the same issue. I'm not sure when this happened as I don't usually visit the dashboard - unless a service goes down.

Navigating to the IP directly (e.g. http://192.168.1.123:3001/) gives no issues.

Using a reverse proxy (I'm using traefik) and navigating to something like https://uptime.mydomain.com, results in the behavior that @kingfisher77 described.

kingfisher77 commented 5 months ago

Yes, I forgot to mention this. We are behind traefik too.

chakflying commented 5 months ago

Looks like you guys are affected by this? #4669

Kerryliu commented 5 months ago

Thank you! It looks like that was the issue.

kingfisher77 commented 5 months ago

Fantastic! Thank you.

cmbcbe commented 2 months ago

For those who use Cloudflare Argo Tunnel: I had a similar issue until I enabled the "Disable chunked encoding" option.

Regards

tarocjsu commented 2 months ago

This error is not limited to domain-name or reverse-proxy access; it also happens when accessing http://ip_address:3001/ directly.

jjoterop commented 1 month ago

I am having this error with GKE:

Current configuration: Uptime Kuma with nginx as reverse proxy. I also tried Caddy and hit the same issue:

Global Load Balancer with managedCerts-->Nginx/Caddy --> Uptime Kuma

If I just use a normal LoadBalancer Service everything works fine, but using a reverse proxy for SSL termination does not work as expected.

Thank you!

CommanderStorm commented 1 month ago

@jjoterop "not working as expected" is not a specific error message. Please open a new issue (instead of piling onto a solved one).

Refer to our reverse proxy documentation for tips first.

rthidden commented 1 month ago

I am getting the same error: 'Lost connection to the socket server. Reconnecting...' I have tried it with Microsoft Edge and Google Chrome with no extensions and in private mode. I installed Uptime Kuma for the first time yesterday. I was getting the error locally yesterday and thought it was just my PC or my connection. But then I started using Uptime Kuma on PikaPods today and am still getting the error.

I tried reloading the app from the PikaPods dashboard and logged out of the app. Now, I am not able to log in because of the error.

Browsers:

  • Microsoft Edge: Version 128.0.2739.9 (Official Build) stable app, beta channel (64-bit)
  • Google Chrome: Version 127.0.6533.100 (Official Build) (64-bit)

From the developer tools, you can see the error keeps repeating.

Adding some Chrome screenshots. Got a bit of a different error there.

And some friendly advice from Gemini (screenshot).

rafi commented 3 weeks ago

For me, the socket disconnections were caused by a proxy_timeout setting that was too low.
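A rough way to reason about this: a long-lived socket only carries traffic when the heartbeat fires, so a proxy idle timeout shorter than the heartbeat interval can cut an otherwise healthy connection. The sketch below is a rule-of-thumb check, not an official formula. The names `Heartbeat` and `idle_timeout_is_safe` are made up, and the 25 s / 20 s values are assumed to be the Socket.IO v4 defaults, so verify them against your own server configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Heartbeat:
    """Heartbeat settings in seconds (assumed defaults, verify for your setup)."""

    ping_interval_s: float = 25.0
    ping_timeout_s: float = 20.0


def idle_timeout_is_safe(
    proxy_idle_timeout_s: float,
    heartbeat: Heartbeat = Heartbeat(),
    margin_s: float = 5.0,
) -> bool:
    """Return True if the proxy waits longer than one heartbeat interval plus a margin.

    Rule of thumb only: an idle connection sees traffic roughly once per
    ping interval, so the proxy must tolerate at least that much silence.
    """
    return proxy_idle_timeout_s > heartbeat.ping_interval_s + margin_s


print(idle_timeout_is_safe(60))  # comfortably above the heartbeat interval
print(idle_timeout_is_safe(10))  # shorter than the heartbeat interval
```

With the assumed defaults, a 10 second idle timeout fails the check while 60 seconds passes; which directive controls the idle timeout depends on the proxy in use.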