Open Kristian-Tan opened 1 year ago
Hm, haven't seen such error yet
Maybe the endpoint will refresh the browser page or click the "Reconnect" button on the browser?
Call /stop and then /start; it'll do the same. We could add /restart, which would do the same, just as a shortcut.
Calling the /stop and then the /start endpoint didn't work (with or without mounted/attached storage for session persistence in Pro).
If I do that, the result is the scan-QR-code page with a loading animation in place of the QR code (the QR code never loads, it just keeps loading forever, as in the attachment below).
To fix that, I have to do this:
The problem can be fixed by calling endpoint to logout, then calling endpoint to stop session, then calling endpoint to logout again, then calling endpoint to start session, then scan the QR code.
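The sequence above can be scripted until a dedicated endpoint exists. Below is a minimal sketch, not the project's official method. The paths `/api/sessions/logout`, `/api/sessions/stop` and `/api/sessions/start`, the `{"name": ...}` request body, and the `http://localhost:3000` base URL are assumptions based on the endpoints mentioned in this thread, so check them against the Swagger page of your WAHA version. The helper names `post_json` and `recover_session` are hypothetical.

```python
import json
import urllib.request
from typing import Callable, List

# Assumption: WAHA is reachable here; adjust to your deployment.
BASE_URL = "http://localhost:3000"

# Order taken from the manual workaround described above.
RECOVERY_STEPS = ("logout", "stop", "logout", "start")


def post_json(path: str, payload: dict) -> int:
    """POST a JSON body to the API and return the HTTP status code.

    urllib raises HTTPError for 4xx/5xx responses, so a failing step
    aborts the sequence instead of being silently ignored.
    """
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status


def recover_session(
    name: str, post: Callable[[str, dict], int] = post_json
) -> List[str]:
    """Run logout -> stop -> logout -> start for one session.

    Returns the list of paths that were called, in order.
    """
    called = []
    for step in RECOVERY_STEPS:
        path = f"/api/sessions/{step}"  # assumed endpoint paths
        post(path, {"name": name})
        called.append(path)
    return called
```

Calling `recover_session("default")` would run the four requests in order; the QR code still has to be scanned manually afterwards. The `post` parameter exists only so the sequence can be tried against a stub before pointing it at a real server.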
Also, I think it would be helpful if you added a way to check whether the "computer not connected" error has happened (maybe inside the GET /api/sessions endpoint?)
Another thing: I don't know how to reproduce this error (hence the 'random' in the title).
Maybe this happened because one of the Chromium processes was killed by the OOM killer. Here's some output that I found on serial0, but I'm not very sure about it. Maybe one session spawns multiple Chromium processes, and if one of those processes is killed because the system is out of memory, WhatsApp reports it as 'unable to connect'?
Interestingly though, RAM usage is only 14GB (including buffers) out of 16GB. If I exclude buffers, usage is only around 8GB. I will experiment with more RAM and see if the problem persists.
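One way to test the OOM-killer theory is to scan the kernel log for kill records and see whether any of them name a browser process. The sketch below assumes the log lines contain the usual `Killed process <pid> (<name>)` wording that Linux kernels print; the exact format varies between kernel versions, and `find_oom_kills` is a hypothetical helper name.

```python
import re
from typing import Iterable, List, Tuple

# Matches the "Killed process 1234 (chrome)" fragment of an OOM-killer record.
# Assumption: the kernel uses this wording; adjust the pattern if yours differs.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (pid, process_name) for every OOM kill found in the log lines."""
    kills = []
    for line in log_lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

The function can be fed the lines of `dmesg` output or of a saved serial console log. If it returns nothing for the period around the failure, the OOM killer is probably not the cause.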
Edit: I also see that the Docker bind mount for sessions takes up a lot of space (hundreds of MB), and my disk space was running low. This may have been the cause too (out of disk space).
Edit 2: maybe the reason it works for a short time after calling the logout endpoint is that logout clears the contents of the sessions bind mount, freeing space on a disk that was previously full.
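To check the disk-space theory, the size of the sessions directory and the free space on its filesystem can be logged periodically. This is a minimal sketch using only the Python standard library. The function names and the 10% threshold are hypothetical, and the directory path you pass in should be the host side of your own bind mount.

```python
import os
import shutil


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for fname in files:
            fpath = os.path.join(root, fname)
            if os.path.islink(fpath):
                continue
            try:
                total += os.path.getsize(fpath)
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total


def disk_report(sessions_dir: str, min_free_fraction: float = 0.10) -> dict:
    """Report the sessions directory size and the free space on its filesystem."""
    usage = shutil.disk_usage(sessions_dir)
    free_fraction = usage.free / usage.total
    return {
        "sessions_bytes": dir_size_bytes(sessions_dir),
        "free_bytes": usage.free,
        "free_fraction": free_fraction,
        # True when free space is below the (arbitrary) threshold.
        "low_space": free_fraction < min_free_fraction,
    }
```

Running `disk_report` on a schedule and comparing the timestamps with the times the error appears would show whether the failures line up with the disk filling.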
The last time it happened was around 2024-08-09T09:xx:xx (last Friday).
At that time, our version was not the latest one (based on the image pull date, it was pulled from https://hub.docker.com/r/devlikeapro/waha on Jun 29, while the image timestamp said it was created 6 weeks ago, so the image should have been created around Jun 2?)
Hopefully the issue is resolved with the latest image.
Self-hosted. Docker runs on a QEMU-KVM virtual machine.
I'm quite sure I'm not hitting a RAM/CPU bottleneck, since I'm allocating very large resources to the VM (20+ vCPUs, 50GB+ RAM); the CPU load never goes above 0.5 when I check with tools like htop/uptime, and the RAM usage never goes above 10GB.
Randomly, some of my sessions get this error:
It's not just once or twice; it occurs a few times a month.
The problem can be fixed by calling endpoint to logout, then calling endpoint to stop session, then calling endpoint to logout again, then calling endpoint to start session, then scan the QR code.
When this "computer not connected" error happens, the webhook and all send (file/text) endpoints for that session stop working. So whenever it happens I need to do the steps above manually.
But it's getting inconvenient, can you add an endpoint to fix this connection problem? Maybe the endpoint will refresh the browser page or click the "Reconnect" button on the browser?
I'm quite sure it's not the quality of my connection, because when I run 4 sessions, only 1 random session is affected by this "computer not connected" error. If my internet connection were unstable, I think it would affect all 4 sessions, not just 1 out of 4.
Engine: WebJS