thelounge / thelounge

💬 ‎ Modern, responsive, cross-platform, self-hosted web IRC client
https://thelounge.chat
MIT License

The lounge crashes on startup when there are too many users * channels open #4284

Open FvD opened 3 years ago

FvD commented 3 years ago

Our goal was to start theLounge with 1800 users connected and 19 channels assigned. This was not possible, as theLounge would crash.

To reproduce the issue: create many user profiles (JSON files) and assign channels to them before starting theLounge. We found that we can crash the server either by creating too many users or by assigning too many channels to users.
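For anyone who wants to try the reproduction, here is a minimal Node.js sketch that generates the profile files. The field names in the profile (`networks`, `host`, `port`, `tls`, `nick`, `channels`), the host `irc.example.test`, and the file naming are assumptions made for illustration, not a verified theLounge schema, so compare them with a profile that theLounge created itself before relying on them. The script writes to a temporary directory rather than a real theLounge users directory.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: builds one user profile object.
// The layout below is an ASSUMED schema, not taken from theLounge's code.
function buildProfile(index, channelCount) {
  const channels = [];
  for (let c = 0; c < channelCount; c++) {
    channels.push({ name: `#load-test-${c}` });
  }
  return {
    networks: [
      {
        name: "loadtest",
        host: "irc.example.test", // placeholder host, replace with your own
        port: 6667,
        tls: false,
        nick: `user${index}`,
        channels,
      },
    ],
  };
}

// Writes userCount profile files into dir and returns the total number of
// user-to-channel assignments that were generated.
function writeProfiles(dir, userCount, channelCount) {
  fs.mkdirSync(dir, { recursive: true });
  for (let u = 0; u < userCount; u++) {
    const file = path.join(dir, `user${u}.json`);
    const profile = buildProfile(u, channelCount);
    fs.writeFileSync(file, JSON.stringify(profile, null, 2));
  }
  return userCount * channelCount;
}

// Numbers from the report: 1800 users, 19 channels each.
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), "lounge-users-"));
const memberships = writeProfiles(outDir, 1800, 19);
console.log(`${memberships} channel assignments written to ${outDir}`);
```

Lowering the two numbers passed to `writeProfiles` makes it easy to search for the point at which the crash starts.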

We used:

userDiscconnect=FALSE

So that when theLounge starts up, the users are immediately connected to the network.

We tried to assign more memory:

export NODE_OPTIONS=--max_old_space_size=55366

This allowed a few more users to connect, but at around user 1000 it consistently crashed again. We tried an 8-CPU, 64 GB machine, but theLounge was unable to use all the available memory.

When we assigned the new users to only 19 channels, the server crashed at around 300 connections.

We did try to understand JavaScript memory usage and tried the following:

export NODE_OPTIONS="--max_old_space_size=55366 --max_semi_space_size=55366"

But this was to no avail. We suspect that there is something happening with memory usage, but we do not know theLounge code and do not understand what could be happening. The errors are either:

FATAL ERROR: Scavenger: semi-space copy Allocation failed - JavaScript heap out of memory

or:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

All this is with Message Storage inactive, because if we keep it active we have the additional problem that disk write load becomes too high and the whole system slows down.
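One way to check whether the `NODE_OPTIONS` values above actually reach the process is to print the limits V8 reports. The sketch below uses Node's built-in `v8` module; the helper names `heapLimitMiB` and `spaceSizesMiB` are made up for this example. As far as we understand it, the semi-space is the young-generation area handled by the Scavenger, which normally sits in the tens of MiB, so a value as large as the old-space limit may not do what was intended.

```javascript
const v8 = require("v8");

const MIB = 1024 * 1024;

// Total heap limit V8 is running with, in MiB.
function heapLimitMiB() {
  return Math.round(v8.getHeapStatistics().heap_size_limit / MIB);
}

// Currently reserved size of each V8 heap space (new_space, old_space, ...), in MiB.
function spaceSizesMiB() {
  const sizes = {};
  for (const space of v8.getHeapSpaceStatistics()) {
    sizes[space.space_name] = Math.round(space.space_size / MIB);
  }
  return sizes;
}

console.log(`NODE_OPTIONS: ${process.env.NODE_OPTIONS || "(not set)"}`);
console.log(`V8 heap size limit: ${heapLimitMiB()} MiB`);
console.log(spaceSizesMiB());
```

Running this with the same environment as theLounge shows whether the configured old-space size is reflected in the reported heap limit.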

brunnre8 commented 3 years ago

I wouldn't exactly mark an out of memory situation as a bug. Yes, the code can probably be written in a more memory-conscious manner but that's likely not that easy.

Did you profile this? Would be nice to know what the big memory chunk consists of.
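As a rough first step before a full heap profile, a sketch like the following samples `process.memoryUsage()` and reports each category in MiB. It can help tell whether the growth is inside the V8 heap (`heapUsed`, `heapTotal`) or in native memory (`rss`, `external`, `arrayBuffers`); the `std::bad_alloc` abort suggests a native allocation failed, though that is only a guess. The helper names `toMiB` and `sampleMemory` are hypothetical, and where to call them inside theLounge is left open.

```javascript
const MIB = 1024 * 1024;

// Converts an object of byte counts (such as the result of
// process.memoryUsage()) into whole MiB per category.
function toMiB(usage) {
  const out = {};
  for (const [key, bytes] of Object.entries(usage)) {
    out[key] = Math.round(bytes / MIB);
  }
  return out;
}

// Takes one sample of the current process memory, in MiB.
function sampleMemory() {
  return toMiB(process.memoryUsage());
}

// Log one sample; in a long-running process this could be called
// periodically, e.g. after every N connected users.
console.log(sampleMemory());
```

For the actual contents of the heap, Node's built-in `v8.writeHeapSnapshot()` or the `--heap-prof` command-line flag produce files that can be opened in Chrome DevTools.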

eliocamp commented 3 years ago

No, we didn't run a profiler. We'd be happy to help you reproduce the problem, though.