devlikeapro / waha

WAHA - WhatsApp HTTP API (REST API) that you can configure in a click! Two engines: chromium-based WEBJS and pure-websocket NOWEB
https://waha.devlike.pro/
Apache License 2.0
Random internet connection problem #230

Open Kristian-Tan opened 1 year ago

Kristian-Tan commented 1 year ago

[Screenshot: Screenshot_20231108_151930]

Randomly, some of my sessions hit this error:

> Computer not connected. Make sure your computer has an active Internet connection. [Reconnect]

It's not just once or twice; it happens a few times a month.

The problem can be fixed by calling the logout endpoint, then the stop-session endpoint, then the logout endpoint again, then the start-session endpoint, and finally scanning the QR code.
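A minimal sketch of that workaround sequence as a script, in case it helps anyone automate it. The endpoint paths (`/api/sessions/logout`, `/api/sessions/stop`, `/api/sessions/start`) and the `{"name": ...}` request body are assumptions about the WAHA session API and may differ between versions, so check them against the Swagger page of your instance. The helper names are hypothetical, and scanning the QR code still has to be done by hand.

```python
import json
import urllib.request

# Order taken from the workaround described above.
RECOVERY_STEPS = ("logout", "stop", "logout", "start")


def build_recovery_requests(base_url, session):
    """Return (url, body) pairs in the order logout, stop, logout, start.

    Assumption: each step is POST {base_url}/api/sessions/<step> with a
    JSON body of {"name": <session>}. Verify against your WAHA version.
    """
    base = base_url.rstrip("/")
    return [
        (f"{base}/api/sessions/{step}", {"name": session})
        for step in RECOVERY_STEPS
    ]


def post_json(url, body):
    """POST a JSON body and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def recover_session(base_url, session, send=post_json):
    """Run the four calls in order and return whatever `send` returns for each."""
    return [send(url, body) for url, body in build_recovery_requests(base_url, session)]
```

Calling `recover_session("http://localhost:3000", "default")` would issue the four requests in order; the `send` parameter exists only so the sequence can be dry-run without a live server.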

When this "computer not connected" error happens, the webhook and all send (file/text) endpoints for that session stop working, so each time I have to do the steps above manually.

This is getting inconvenient. Can you add an endpoint to fix this connection problem? Maybe the endpoint will refresh the browser page or click the "Reconnect" button on the browser?

I'm quite sure it's not the quality of my connection, because when I run 4 sessions, only 1 random session is affected by this "computer not connected" error. If my internet connection were unstable, I think it would affect all 4 sessions, not just 1 out of 4.

Engine: WebJS

allburov commented 1 year ago

Hm, I haven't seen this error yet.

> Maybe the endpoint will refresh the browser page or click the "Reconnect" button on the browser?

Call /stop and then /start; it'll do the same. We could add /restart, which would do the same, just as a shortcut.

Kristian-Tan commented 1 year ago

Calling the /stop and then /start endpoints didn't work (with or without mounted/attached storage for session persistence in PRO).

If I do that, the result is the "scan QR code" page with a loading animation in place of the QR code. The QR code never loads; it just keeps loading forever, as in the attachment below.

[Screenshot: QR code page stuck on the loading animation]

To fix that, I have to do this:

> The problem can be fixed by calling the logout endpoint, then the stop-session endpoint, then the logout endpoint again, then the start-session endpoint, and finally scanning the QR code.

Also, I think it would be helpful if you added a way to check whether the "computer not connected" error has happened (maybe inside the GET /api/sessions endpoint?).

Another thing: I don't know how to reproduce this error (hence "random" in the title).

Kristian-Tan commented 1 year ago

Maybe this happened because one of the Chromium processes was killed by the OOM killer. Here's some output I found on serial0, but I'm not sure about it. Maybe one session spawns multiple Chromium processes, and if one of those processes is killed for running out of memory, WhatsApp reports it as "unable to connect"?

[Screenshot: serial0 console output]

Interestingly, though, the RAM usage is only 14GB (including buffers) out of 16GB. Excluding buffers, the usage is only around 8GB. I will experiment with more RAM and see if the problem persists.
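One way to confirm or rule out the OOM-killer theory is to scan the kernel log for kill records. The sketch below assumes the usual Linux message wording ("Killed process <pid> (<name>)"), which can vary between kernel versions; `find_oom_kills` is a hypothetical helper name, and the log text would come from something like `dmesg` or `journalctl -k`.

```python
import re

# Matches kernel log fragments such as "Killed process 1234 (chrome)".
# Assumption: the kernel uses this wording; adjust the pattern if yours differs.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) for each OOM kill found in the text."""
    return [(int(pid), name) for pid, name in OOM_KILL.findall(log_text)]
```

If the returned list contains Chromium processes with timestamps close to the moment a session dropped, memory pressure becomes a much more likely cause.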

Edit: I also see that the Docker bind mount for sessions takes up a lot of space (hundreds of MB), and my disk space was running low. Running out of disk space may have been the cause too.

Edit 2: Maybe the reason it works for a short time after calling the logout endpoint is that logout clears the contents of the sessions bind mount, which frees space on a disk that was previously full.
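To test the disk-space theory, the size of the sessions directory and the free space on its filesystem can be logged periodically. This is a rough sketch using only the Python standard library; the 1 GB threshold and the function names are illustrative assumptions, and the path would be the host side of whatever bind mount you use for sessions.

```python
import os
import shutil


def directory_size_bytes(path):
    """Sum the sizes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file may vanish while a session is writing; ignore it.
                pass
    return total


def disk_report(path, min_free_bytes=1_000_000_000):
    """Report sessions directory size and free space on its filesystem."""
    usage = shutil.disk_usage(path)
    return {
        "sessions_bytes": directory_size_bytes(path),
        "free_bytes": usage.free,
        "low_disk": usage.free < min_free_bytes,
    }
```

Comparing the report before and after a logout call would show whether logout really frees a significant amount of space.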

devlikepro commented 3 months ago

@Kristian-Tan hi! Do you have any problems or updates regarding this issue? We've made a few adjustments to session management, so it should consume less memory and disk space.

Kristian-Tan commented 3 months ago

It last happened around 2024-08-09T09:xx:xx (last Friday).

At that time, our version was not the latest one. Based on the image pull date, it was pulled from https://hub.docker.com/r/devlikeapro/waha on Jun 29, and the image timestamp said it was created 6 weeks earlier, so the image was probably created around Jun 2?

Hopefully the issue is resolved with the latest image.

devlikepro commented 3 months ago

That's strange. Where do you host the server? Do you use a proxy?

Kristian-Tan commented 3 months ago

Self-hosted. Docker runs on a QEMU-KVM virtual machine. I'm quite sure I'm not hitting a RAM/CPU bottleneck, since I've allocated plenty of resources to the VM (20+ vCPUs, 50GB+ RAM). The CPU usage never goes above 0.5 in tools like htop/uptime, and the RAM usage never goes above 10GB.