esl / MongooseIM

MongooseIM is Erlang Solutions' robust, scalable and efficient XMPP server, aimed at large installations. Specifically designed for enterprise purposes, it is fault-tolerant and can utilise the resources of multiple clustered machines.

ETS memory filling up with the increase in user base #3113

Closed jaspreet-android closed 1 year ago

jaspreet-android commented 3 years ago

MongooseIM version: 3.6.0 Installed from: source Erlang/OTP version: 22.x

MongooseIM keeps filling up RAM and does not release it during low-usage periods. We have around 10,000 concurrent users on a 3-node cluster. Each of the 3 nodes has 20 GB of RAM and 8 cores.

Here is a memory snapshot taken from the MongooseIM debug shell:

(mongooseim@mongooseim-01)1> erlang:memory().
[{total,15273742784},
 {processes,371082608},
 {processes_used,370246392},
 {system,14902660176},
 {atom,916433},
 {atom_used,888532},
 {binary,1512364912},
 {code,17795636},
 {ets,13348480784}]

arcusfelis commented 3 years ago

Hi!

Try this snippet to check which processes consume the most memory:

rp(lists:reverse(
     [{Len,
       case erlang:process_info(TopPid, registered_name) of
           {_, X} -> X;
           _ -> TopPid
       end,
       erlang:process_info(TopPid, current_stacktrace)}
      || {Len, TopPid} <-
             lists:sublist(
               lists:reverse(
                 lists:keysort(1,
                   [{try element(2, erlang:process_info(Pid, memory))
                     catch _:_ -> -1 end, Pid}
                    || Pid <- erlang:processes()])),
               20)])).
jaspreet-android commented 3 years ago

@arcusfelis Please see the output from node 1 below:

(mongooseim@mongooseim-01)1> erlang:memory().
[{total,15377272032},
 {processes,420491752},
 {processes_used,419461160},
 {system,14956780280},
 {atom,916433},
 {atom_used,888630},
 {binary,1481635240},
 {code,17795636},
 {ets,13434174664}]

(mongooseim@mongooseim-01)2> rp(lists:reverse([{Len, case erlang:process_info(TopPid, registered_name) of {_,X} -> X; _ -> TopPid end, erlang:process_info(TopPid, current_stacktrace)} || {Len, TopPid} <- lists:sublist(lists:reverse(lists:keysort(1, [{try element(2,erlang:process_info(Pid, memory)) catch _:_ -> -1 end, Pid} || Pid <- erlang:processes()])), 20)])).
[{426524,<0.1167.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {426524,<0.1169.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {426524,<0.1177.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {460012,<0.645.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {460012,<0.1173.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {460012,<0.1207.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {608272,ejabberd_c2s_sup,
  {current_stacktrace,[{gen_server,loop,7,
                                   [{file,"gen_server.erl"},{line,394}]},
                       {proc_lib,init_p_do_apply,3,
                                 [{file,"proc_lib.erl"},{line,249}]}]}},
 {689540,<0.1025.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {689540,<0.1031.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {689540,<0.1115.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {1115108,<0.986.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {1116444,<0.25171.683>,
  {current_stacktrace,[{erl_eval,do_apply,6,
                                 [{file,"erl_eval.erl"},{line,684}]},
                       {erl_eval,expr_list,6,[{file,"erl_eval.erl"},{line,888}]},
                       {erl_eval,expr,5,[{file,"erl_eval.erl"},{line,240}]},
                       {erl_eval,eval_lc1,6,[{file,"erl_eval.erl"},{line,706}]},
                       {erl_eval,eval_generate,7,
                                 [{file,"erl_eval.erl"},{line,735}]},
                       {erl_eval,eval_lc,6,[{file,"erl_eval.erl"},{line,692}]},
                       {erl_eval,expr_list,6,[{file,"erl_eval.erl"},{line,888}]}]}},
 {1803692,<0.1181.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {1803692,<0.1185.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {2546612,<0.978.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {2547620,<0.1475.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {4720580,<0.1472.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {4720628,<0.643.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {4720628,<0.1470.0>,
  {current_stacktrace,[{exometer_probe,loop,1,
                                       [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/default/lib/exometer_core/src/exometer_probe.erl"},
                                        {line,673}]},
                       {proc_lib,init_p,3,[{file,"proc_lib.erl"},{line,234}]}]}},
 {97126160,
  'ejabberd_mod_pubsub_loop_service51.comerachain.com',
  {current_stacktrace,[{mod_pubsub,send_loop,1,
                                   [{file,"/var/lib/jenkins/workspace/MONGOOSEIM-PROD-CLUSTER/_build/prod/lib/mongooseim/src/pubsub/mod_pubsub.erl"},
                                    {line,456}]}]}}]
ok
jaspreet-android commented 3 years ago

Hi @arcusfelis, I further found that the pubsub_last_item ETS table is taking more than 13 GB of memory. How can we control it?

I used:

ets:info(pubsub_last_item,memory) * erlang:system_info(wordsize).

Output on the 3 nodes (in words, before multiplying by the word size):

(mongooseim@mongooseim-01)1> ets:info(pubsub_last_item,memory).
1709650491
(mongooseim@mongooseim-02)1> ets:info(pubsub_last_item,memory).
1709658380
(mongooseim@mongooseim-03)1> ets:info(pubsub_last_item,memory).
1709666097
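
For reference, ets:info/2 reports memory in words, so the figures above have to be multiplied by the word size to get bytes. The following is a minimal sketch for the debug shell, assuming a 64-bit VM (8-byte words) and using the node 1 figure. The second part is a generic way to rank all ETS tables on a node by size; it is not taken from MongooseIM, and the variable names are illustrative.

```erlang
%% Convert the node 1 figure from words to bytes (8-byte words assumed).
Bytes = 1709650491 * 8,   % 13677203928 bytes, roughly 13.7 GB

%% Rank every ETS table on this node by memory, largest first.
WordSize = erlang:system_info(wordsize),
Sizes = [{Words * WordSize, Tab}
         || Tab <- ets:all(),
            Words <- [ets:info(Tab, memory)],
            is_integer(Words)],   % skip tables deleted while iterating
Top10 = lists:sublist(lists:reverse(lists:sort(Sizes)), 10).
```

The shell prints Top10 as a list of {Bytes, Table} pairs, which makes it easy to confirm whether a single table such as pubsub_last_item accounts for most of the ets figure reported by erlang:memory().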
jaspreet-android commented 3 years ago

Can anybody help here?

jaspreet-android commented 3 years ago

I was able to reduce RAM usage by purging pubsub_last_item entries from the Mnesia cache (the table is held in ETS) for users older than a certain number of days.

arcusfelis commented 3 years ago

Try using the RDBMS backend for mod_pubsub.

So you need to configure the caching backend for PubSub to use RDBMS.

And generally, you want to configure all modules that support RDBMS to use RDBMS.
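
As a rough illustration of that recommendation, here is a configuration sketch in the Erlang-term config format used by the 3.x releases. It assumes the backend and last_item_cache options described in the mod_pubsub documentation, and that an RDBMS connection pool is already configured elsewhere in the file. Treat the option names and values as assumptions to verify against the documentation for your release.

```erlang
%% Entry for the modules list of the 3.x Erlang-term configuration file.
%% Option names are assumed from the mod_pubsub docs; verify for your version.
{mod_pubsub, [
    {backend, rdbms},          % keep PubSub nodes and items in the database
    {last_item_cache, rdbms}   % keep the last-item cache out of Mnesia/ETS
]}
```

If the last-item cache is not needed at all, the same documentation suggests it can be disabled instead of moved, which would also keep the pubsub_last_item table from growing.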

jaspreet-android commented 3 years ago

Since it takes up 13 GB of RAM, do you think moving it to MySQL will slow MySQL down?

jaspreet-android commented 3 years ago

Some more details from Erlang etop:

https://drive.google.com/file/d/1d40FWgw3_wwQjkDYinsNuL5lKoW70Kii/view?usp=sharing

arcusfelis commented 1 year ago

> Since it takes up 13 GB of RAM, do you think moving it to MySQL will slow MySQL down?

It would slow down MySQL, of course, like any module that uses the database. Be aware that, unlike Mnesia, MySQL is designed to store these 13 GB on disk and keep only part of it in RAM. There would be extra calls to the database, but your Erlang VM would stop crashing, and MongooseIM performance would not degrade much.

We don't recommend using Mnesia to store persistent data, because it was not designed to handle large amounts of data (or a growing dataset, like PubSub data). Mnesia was designed to store a small amount of configuration data that has a constant size and is not updated very often.

Mnesia can still be used as sm_backend (the session management table), but once the CETS backend is implemented and merged into master, the general rule will be to avoid using Mnesia.

Waiting for some updates before closing the issue.

chrzaszcz commented 1 year ago

Closing because of inactivity and clear recommendations from @arcusfelis