esl / MongooseIM

MongooseIM is Erlang Solutions' robust, scalable and efficient XMPP server, aimed at large installations. Specifically designed for enterprise purposes, it is fault-tolerant and can utilise the resources of multiple clustered machines.

Websocket errors #260

Closed trungvu12 closed 10 years ago

trungvu12 commented 10 years ago

Hi,

I have a MongooseIM cluster of 3 instances behind DNS round-robin load balancing. Our clients connect to the server over secure WebSocket. I am getting a lot of the following errors in crash.log:

2014-08-05 19:40:04 =ERROR REPORT==== Ranch listener {mod_websockets,secure} had connection process started with cowboy_protocol:start_link/4 at <0.23872.0> exit with reason: {[{reason,{badmatch,{error,{"junk after document element",<<"Yz1iaXdzLHI9ZDQxZDhjZDk4ZjAwYjIwNGU5ODAwOTk4ZWNmODQyN2VoVTNxbHhlbzRsNDlsa3FiTFdoMmd3PT0scD1uclh5aWE3N2tqV1NhWFYxY0ZpQXY0cWtIRXM9">>}}}},{mfa,{mod_websockets,websocket_handle,3}},{stacktrace,[{mod_websockets,handle_text,2,[{file,"src/mod_websockets.erl"},{line,226}]},{mod_websockets,websocket_handle,3,[{file,"src/mod_websockets.erl"},{line,179}]},{cowboy_websocket,handler_call,7,[{file,"src/cowboy_websocket.erl"},{line,598}]},{cowboy_protocol,execute,4,[{file,"src/cowboy_protocol.erl"},{line,529}]}]},{msg,{text,<<"Yz1iaXdzLHI9ZDQxZDhjZDk4ZjAwYjIwNGU5ODAwOTk4ZWNmODQyN2VoVTNxbHhlbzRsNDlsa3FiTFdoMmd3PT0scD1uclh5aWE3N2tqV1NhWFYxY0ZpQXY0cWtIRXM9">>}},{req,[{socket,{sslsocket,{gen_tcp,#Port<0.12693>,tls_connection},<0.23869.0>}},{transport,ranch_ssl},{connection,keepalive},{pid,<0.23872.0>},{method,<<"GET">>},{version,'HTTP/1.1'},{peer,{{27,72,89,62},55761}},{host,<<"imis01.imcluster.ms">>},{host_info,undefined},{port,443},{path,<<"/ws-xmpp">>},{path_info,undefined},{qs,<<"username=vinhloc810%40imcluster.ms&password=7MSGwsENQj=webibe29">>},{qs_vals,undefined},{bindings,[]},{headers,[{<<"upgrade">>,<<"websocket">>},{<<"connection">>,<<"Upgrade">>},{<<"host">>,<<"imis01.imcluster.ms">>},{<<"origin">>,<<"https://imcluster.ms">>},{<<"sec-websocket-protocol">>,<<"xmpp">>},{<<"pragma">>,<<"no-cache">>},{<<"cache-control">>,<<"no-cache">>},{<<"sec-websocket-key">>,<<"sVF8F0uWMGPFmOwx3xdZXg==">>},{<<"sec-websocket-version">>,<<"13">>},{<<"sec-websocket-extensions">>,<<"permessage-deflate; client_max_window_bits, x-webkit-deflate-frame">>},{<<"user-agent">>,<<"Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.157 CoRom/35.0.1916.157 
Safari/537.36">>}]},{p_headers,[{<<"sec-websocket-extensions">>,[{<<"permessage-deflate">>,[<<"client_max_window_bits">>]},{<<"x-webkit-deflate-frame">>,[]}]},{<<"upgrade">>,[<<"websocket">>]},{<<"connection">>,[<<"upgrade">>]}]},{cookies,undefined},{meta,[{websocket_compress,false},{websocket_version,13}]},{body_state,waiting},{multipart,undefined},{buffer,<<>>},{resp_compress,false},{resp_state,done},{resp_headers,[]},{resp_body,<<>>},{onresponse,undefined}]},{state,{ws_state,<0.23873.0>,undefined,{parser,<<>>,{config,false,false},[]}}}],[{cowboy_protocol,execute,4,[{file,"src/cowboy_protocol.erl"},{line,529}]}]} 2014-08-05 19:40:06 =ERROR REPORT==== Error in process <0.23886.0> on node 'mongooseim@mogooseNode02' with exit value: {[{reason,{badmatch,{error,{"junk after document element",<<114 bytes>>}}}},{mfa,{mod_websockets,websocket_handle,3}},{stacktrace,[{mod_websockets,handle_text,2,[{file,"src/mod_websockets.erl"},{line,226}]},{mod_websockets,websocket_handle...

2014-08-05 19:40:06 =ERROR REPORT==== Ranch listener {mod_websockets,secure} had connection process started with cowboy_protocol:start_link/4 at <0.23886.0> exit with reason: {[{reason,{badmatch,{error,{"junk after document element",<<"">>}}}},{mfa,{mod_websockets,websocket_handle,3}},{stacktrace,[{mod_websockets,handle_text,2,[{file,"src/mod_websockets.erl"},{line,226}]},{mod_websockets,websocket_handle,3,[{file,"src/mod_websockets.erl"},{line,179}]},{cowboy_websocket,handler_call,7,[{file,"src/cowboy_websocket.erl"},{line,598}]},{cowboy_protocol,execute,4,[{file,"src/cowboy_protocol.erl"},{line,529}]}]},{msg,{text,<<"">>}},{req,[{socket,{sslsocket,{gen_tcp,#Port<0.12697>,tls_connection},<0.23880.0>}},{transport,ranch_ssl},{connection,keepalive},{pid,<0.23886.0>},{method,<<"GET">>},{version,'HTTP/1.1'},{peer,{{114,32,136,94},46304}},{host,<<"imis01.imcluster.ms">>},{host_info,undefined},{port,443},{path,<<"/ws-xmpp">>},{path_info,undefined},{qs,<<"username=chunch.hsu%40imcluster.ms&password=TwzY1OU7&resource=clientnw">>},{qs_vals,undefined},{bindings,[]},{headers,[{<<"upgrade">>,<<"websocket">>},{<<"connection">>,<<"Upgrade">>},{<<"host">>,<<"imis01.imcluster.ms">>},{<<"origin">>,<<"https://imcluster.ms">>},{<<"sec-websocket-protocol">>,<<"xmpp">>},{<<"pragma">>,<<"no-cache">>},{<<"cache-control">>,<<"no-cache">>},{<<"sec-websocket-key">>,<<"wC4jqLoAwRyxVL9xFJ+fPA==">>},{<<"sec-websocket-version">>,<<"13">>},{<<"sec-websocket-extensions">>,<<"permessage-deflate; client_max_window_bits, x-webkit-deflate-frame">>},{<<"user-agent">>,<<"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (@161254) (KHTML, like Gecko) Chrome/30.1.2345.67 Safari/537.36 NodeWebkit/0.9.2 
Unseen/0.1.7sion">>}]},{p_headers,[{<<"sec-websocket-extensions">>,[{<<"permessage-deflate">>,[<<"client_max_window_bits">>]},{<<"x-webkit-deflate-frame">>,[]}]},{<<"upgrade">>,[<<"websocket">>]},{<<"connection">>,[<<"upgrade">>]}]},{cookies,undefined},{meta,[{websocket_compress,false},{websocket_version,13}]},{body_state,waiting},{multipart,undefined},{buffer,<<>>},{resp_compress,false},{resp_state,done},{resp_headers,[]},{resp_body,<<>>},{onresponse,undefined}]},{state,{ws_state,<0.23887.0>,undefined,{parser,<<>>,{config,false,false},[]}}}],[{cowboy_protocol,execute,4,[{file,"src/cowboy_protocol.erl"},{line,529}]}]} 2014-08-05 19:40:07 =ERROR REPORT==== Error in process <0.23889.0> on node 'mongooseim@mogooseNode02' with exit value: {[{reason,{badmatch,{error,{"junk after document element",<<190 bytes>>}}}},{mfa,{mod_websockets,websocket_handle,3}},{stacktrace,[{mod_websockets,handle_text,2,[{file,"src/mod_websockets.erl"},{line,226}]},{mod_websockets,websocket_handle...

2014-08-05 19:37:18 =ERROR REPORT==== Ranch listener {mod_websockets,secure} had connection process started with cowboy_protocol:start_link/4 at <0.15948.0> exit with reason: {[{reason,{badmatch,{error,{"junk after document element",<<"Yz1iaXdzLHI9ZDQxZDhjZDk4ZjAwYjIwNGU5ODAwOTk4ZWNmODQyN2VmTzhzc0VnbVRtbUZkd3BPQWVEeTJnPT0scD1VOWxzcFBDVmxHRHZzWmVoRUxBL0JzRHhLMU09">>}}}},{mfa,{mod_websockets,websocket_handle,3}},{stacktrace,[{mod_websockets,handle_text,2,[{file,"src/mod_websockets.erl"},{line,226}]},{mod_websockets,websocket_handle,3,[{file,"src/mod_websockets.erl"},{line,179}]},{cowboy_websocket,handler_call,7,[{file,"src/cowboy_websocket.erl"},{line,598}]},{cowboy_protocol,execute,4,[{file,"src/cowboy_protocol.erl"},{line,529}]}]},{msg,{text,<<"Yz1iaXdzLHI9ZDQxZDhjZDk4ZjAwYjIwNGU5ODAwOTk4ZWNmODQyN2VmTzhzc0VnbVRtbUZkd3BPQWVEeTJnPT0scD1VOWxzcFBDVmxHRHZzWmVoRUxBL0JzRHhLMU09">>}},{req,[{socket,{sslsocket,{gen_tcp,#Port<0.9418>,tls_connection},<0.15938.0>}},{transport,ranch_ssl},{connection,keepalive},{pid,<0.15948.0>},{method,<<"GET">>},{version,'HTTP/1.1'},{peer,{{101,172,170,161},62222}},{host,<<"imis01.imcluster.ms">>},{host_info,undefined},{port,443},{path,<<"/ws-xmpp">>},{path_info,undefined},{qs,<<"username=mario3o%40imcluster.ms&password=RpzrsuD6fem&resource=webe6d2t9">>},{qs_vals,undefined},{bindings,[]},{headers,[{<<"upgrade">>,<<"websocket">>},{<<"connection">>,<<"Upgrade">>},{<<"host">>,<<"imis01.imcluster.ms">>},{<<"origin">>,<<"https://imcluster.ms">>},{<<"sec-websocket-protocol">>,<<"xmpp">>},{<<"pragma">>,<<"no-cache">>},{<<"cache-control">>,<<"no-cache">>},{<<"sec-websocket-key">>,<<"WRdMiCMm75E5wf0rs5Nb1w==">>},{<<"sec-websocket-version">>,<<"13">>},{<<"sec-websocket-extensions">>,<<"permessage-deflate; client_max_window_bits, x-webkit-deflate-frame">>},{<<"user-agent">>,<<"Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 
Safari/537.36">>}]},{p_headers,[{<<"sec-websocket-extensions">>,[{<<"permessage-deflate">>,[<<"client_max_window_bits">>]},{<<"x-webkit-deflate-frame">>,[]}]},{<<"upgrade">>,[<<"websocket">>]},{<<"connection">>,[<<"upgrade">>]}]},{cookies,undefined},{meta,[{websocket_compress,false},{websocket_version,13}]},{body_state,waiting},{multipart,undefined},{buffer,<<>>},{resp_compress,false},{resp_state,done},{resp_headers,[]},{resp_body,<<>>},{onresponse,undefined}]},{state,{ws_state,<0.15949.0>,undefined,{parser,<<>>,{config,false,false},[]}}}],[{cowboy_protocol,execute,4,[{file,"src/cowboy_protocol.erl"},{line,529}]}]} 2014-08-05 19:37:19 =ERROR REPORT==== Error in process <0.15945.0> on node 'mongooseim@mogooseNode01' with exit value: {[{reason,{badmatch,{error,{"junk after document element",<<190 bytes>>}}}},{mfa,{mod_websockets,websocket_handle,3}},{stacktrace,[{mod_websockets,handle_text,2,[{file,"src/mod_websockets.erl"},{line,226}]},{mod_websockets,websocket_handle...

2014-08-05 19:37:19 =ERROR REPORT==== Ranch listener {mod_websockets,secure} had connection process started with cowboy_protocol:start_link/4 at <0.15945.0> exit with reason: {[{reason,{badmatch,{error,{"junk after document element",<<"Yz1iaXdzLHI9ZDQxZDhjZDk4ZjAwYjIwNGU5ODAwOTk4ZWNmODQyN2ViZStaOWc4WEJmN0s2TmU1OWlvREdBPT0scD0wd05OdWFocWtJNFNNejRtMUhKZUNuM0k4SGM9">>}}}},{mfa,{mod_websockets,websocket_handle,3}},{stacktrace,[{mod_websockets,handle_text,2,[{file,"src/mod_websockets.erl"},{line,226}]},{mod_websockets,websocket_handle,3,[{file,"src/mod_websockets.erl"},{line,179}]},{cowboy_websocket,handler_call,7,[{file,"src/cowboy_websocket.erl"},{line,598}]},{cowboy_protocol,execute,4,[{file,"src/cowboy_protocol.erl"},{line,529}]}]},{msg,{text,<<"Yz1iaXdzLHI9ZDQxZDhjZDk4ZjAwYjIwNGU5ODAwOTk4ZWNmODQyN2ViZStaOWc4WEJmN0s2TmU1OWlvREdBPT0scD0wd05OdWFocWtJNFNNejRtMUhKZUNuM0k4SGM9">>}},{req,[{socket,{sslsocket,{gen_tcp,#Port<0.9417>,tls_connection},<0.15937.0>}},{transport,ranch_ssl},{connection,keepalive},{pid,<0.15945.0>},{method,<<"GET">>},{version,'HTTP/1.1'},{peer,{{184,53,0,24},63012}},{host,<<"imis01.imcluster.ms">>},{host_info,undefined},{port,443},{path,<<"/ws-xmpp">>},{path_info,undefined},{qs,<<"username=livef3%40imcluster.ms&password=hnxkVUA9&resource=webn">>},{qs_vals,undefined},{bindings,[]},{headers,[{<<"host">>,<<"imis01.imcluster.ms">>},{<<"user-agent">>,<<"Mozilla/5.0 (X11; Ubuntu; Linux x8664; rv:30.0) Gecko/20100101 Firefox/30.0">>},{<<"accept">>,<<"text/html,application/xhtml+xml,application/xml;q=0.9,/_;q=0.8">>},{<<"accept-language">>,<<"en-US,en;q=0.5">>},{<<"accept-encoding">>,<<"gzip, deflate">>},{<<"sec-websocket-version">>,<<"13">>},{<<"origin">>,<<"https://imcluster.ms">>},{<<"sec-websocket-protocol">>,<<"xmpp">>},{<<"sec-websocket-key">>,<<"/aXgmF5Vsk9o3m3Z6eUkSQ==">>},{<<"connection">>,<<"keep-alive, 
Upgrade">>},{<<"pragma">>,<<"no-cache">>},{<<"cache-control">>,<<"no-cache">>},{<<"upgrade">>,<<"websocket">>}]},{p_headers,[{<<"upgrade">>,[<<"websocket">>]},{<<"connection">>,[<<"keep-alive">>,<<"upgrade">>]}]},{cookies,undefined},{meta,[{websocket_compress,false},{websocket_version,13}]},{body_state,waiting},{multipart,undefined},{buffer,<<>>},{resp_compress,false},{resp_state,done},{resp_headers,[]},{resp_body,<<>>},{onresponse,undefined}]},{state,{ws_state,<0.15946.0>,undefined,{parser,<<>>,{config,false,false},[]}}}],[{cowboy_protocol,execute,4,[{file,"src/cowboy_protocol.erl"},{line,529}]}]} 2014-08-05 19:37:19 =ERROR REPORT==== Error in process <0.15891.0> on node 'mongooseim@mogooseNode01' with exit value: {[{reason,{badmatch,{error,{"junk after document element",<<190 bytes>>}}}},{mfa,{mod_websockets,websocket_handle,3}},{stacktrace,[{mod_websockets,handle_text,2,[{file,"src/mod_websockets.erl"},{line,226}]},{mod_websockets,websocket_handle...

Is there anything wrong with my servers?

Sometimes the server is not able to handle client requests (get rosters, ...); it just hangs and I have to restart the node. I have only 500 active users in my cluster.

Please help,

Thanks.

fenek commented 10 years ago

This error is related to XML parsing. Please set log level in MongooseIM to debug (5) or trace outgoing traffic on client side. If you choose the former, look for log lines with mod_websockets and Received phrases.
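The payload is already visible in the crash reports, so it can be inspected before raising the log level. The sketch below (Python, standard library only) decodes the text frame from the first report and checks whether the frame is well-formed XML on its own. The decoded fields look like a SCRAM client-final message; that reading, and the idea that the client should have wrapped the payload in a SASL response element, are assumptions that this thread does not confirm. The helper name `is_well_formed_xml` is made up for the example.

```python
import base64
import xml.etree.ElementTree as ET

# Text frame copied from the first crash report ({msg,{text,<<"...">>}}).
frame = (
    "Yz1iaXdzLHI9ZDQxZDhjZDk4ZjAwYjIwNGU5ODAwOTk4ZWNmODQyN2VoVTNxbHhlbzRs"
    "NDlsa3FiTFdoMmd3PT0scD1uclh5aWE3N2tqV1NhWFYxY0ZpQXY0cWtIRXM9"
)

# Standard XMPP SASL namespace (RFC 6120); using it here is an assumption
# about what the client was supposed to send.
SASL_NS = "urn:ietf:params:xml:ns:xmpp-sasl"


def is_well_formed_xml(text: str) -> bool:
    """Return True if text parses as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Decode the frame and list the attribute names of the comma-separated fields.
decoded = base64.b64decode(frame).decode("utf-8", errors="replace")
fields = [part.split("=", 1)[0] for part in decoded.split(",")]

# The bare base64 text is not an XML element, so an XML parser rejects it.
bare_ok = is_well_formed_xml(frame)

# The same payload wrapped in a SASL <response/> element parses cleanly.
wrapped = f"<response xmlns='{SASL_NS}'>{frame}</response>"
wrapped_ok = is_well_formed_xml(wrapped)

print("fields:", fields)
print("bare frame is XML:", bare_ok)
print("wrapped frame is XML:", wrapped_ok)
```

If the bare frame is rejected and the wrapped one is accepted, that would be consistent with the client sending the SASL payload outside of any stanza, which is worth checking in the client-side trace.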

fenek commented 10 years ago

As for the broken node: does it use 100% CPU or, on the contrary, is it completely frozen? Can it serve other users' requests? Can users connect to the node at all?

trungvu12 commented 10 years ago

CPU usage is not at 100%, but the node is frozen. Authentication works, but clients cannot send 'online' presence, get the roster or do anything else.

The websocket errors happen when our users disconnect from the server and try to reconnect. Some of them are able to reconnect, but many cannot, and the server writes these errors to crash.log.

I ran a load test on my MongooseIM cluster and it was able to handle 200k active sessions, but when I deployed it to our production environment (to replace an Openfire cluster) this issue started happening. I am not sure why the MongooseIM instance freezes.

I use MySQL (Percona 5.6) and Redis as session storage. Here are my configuration files:

vm.args

## Name of the node
-sname mongooseim@mogooseNode01

## Cookie for distributed erlang
-setcookie ejabberd

## Heartbeat management; auto-restarts VM if it dies or becomes unresponsive
## (Disabled by default..use with caution!)
-heart

## Enable kernel poll and a few async threads
+K true +A 5 +P 10000000

## Increase number of concurrent ports/sockets
-env ERL_MAX_PORTS 350000

## Tweak GC to run more often
-env ERL_FULLSWEEP_AFTER 2

## With lager sasl reports are redundant so turn them off
-sasl sasl_error_logger false

-kernel inet_dist_listen_min 50000 inet_dist_listen_max 50100


ejabberd.cfg

%%%
%%%               ejabberd configuration file
%%%
%%%'

%%% The parameters used in this configuration file are explained in more detail
%%% in the ejabberd Installation and Operation Guide.
%%% Please consult the Guide in case of doubts, it is included with
%%% your copy of ejabberd, and is also available online at
%%% http://www.process-one.net/en/ejabberd/docs/

%%% This configuration file contains Erlang terms.
%%% In case you want to understand the syntax, here are the concepts:
%%%
%%%  - The character to comment a line is %
%%%
%%%  - Each term ends in a dot, for example:
%%%      override_global.
%%%
%%%  - A tuple has a fixed definition, its elements are
%%%    enclosed in {}, and separated with commas:
%%%      {loglevel, 4}.
%%%
%%%  - A list can have as many elements as you want,
%%%    and is enclosed in [], for example:
%%%      [http_poll, web_admin, tls]
%%%
%%%    Pay attention that list elements are delimited with commas,
%%%    but no comma is allowed after the last list element. This will
%%%    give a syntax error unlike in more lenient languages (e.g. Python).
%%%
%%%  - A keyword of ejabberd is a word in lowercase.
%%%    Strings are enclosed in "" and can contain spaces, dots, ...
%%%      {language, "en"}.
%%%      {ldap_rootdn, "dc=example,dc=com"}.
%%%
%%%  - This term includes a tuple, a keyword, a list, and two strings:
%%%      {hosts, ["jabber.example.net", "im.example.com"]}.
%%%
%%%  - This config is preprocessed during release generation by a tool which
%%%    interprets double curly braces as substitution markers, so avoid this
%%%    syntax in this file (though it's valid Erlang).
%%%
%%%    So this is OK (though arguably looks quite ugly):
%%%      { {s2s_addr, "example-host.net"}, {127,0,0,1} }.
%%%
%%%    And I can't give an example of what's not OK exactly because
%%%    of this rule.
%%%

%%%.   =======================
%%%'   OVERRIDE STORED OPTIONS

%%
%% Override the old values stored in the database.
%%

%%
%% Override global options (shared by all ejabberd nodes in a cluster).
%%
%%override_global.

%%
%% Override local options (specific for this particular ejabberd node).
%%
%%override_local.

%%
%% Remove the Access Control Lists before new ones are added.
%%
%%override_acls.

%%%.   =========
%%%'   DEBUGGING

%%
%% loglevel: Verbosity of log files generated by ejabberd.
%% 0: No ejabberd log at all (not recommended)
%% 1: Critical
%% 2: Error
%% 3: Warning
%% 4: Info
%% 5: Debug
%%
{loglevel, 0}.

%%
%% alarms: an optional alarm handler, subscribed to system events
%% long_gc: minimum GC time in ms for long_gc alarm
%% large_heap: minimum process heap size for large_heap alarm
%% handlers: a list of alarm handlers
%%   - alarms_basic_handler: logs alarms and stores a brief alarm summary
%%   - alarms_folsom_handler: stores alarm details in folsom metrics
%%
%% Example:
%% {alarms,
%%  [{long_gc, 10000},
%%   {large_heap, 1000000},
%%   {handlers, [alarms_basic_handler,
%%               alarms_folsom_handler]}]
%% }.

%%
%% watchdog_admins: Only useful for developers: if an ejabberd process
%% consumes a lot of memory, send live notifications to these XMPP
%% accounts. Requires alarms (see above).
%%
%%{watchdog_admins, ["bob@example.com"]}.

%%%.   ================
%%%'   SERVED HOSTNAMES

%%
%% hosts: Domains served by ejabberd.
%% You can define one or several, for example:
%% {hosts, ["example.net", "example.com", "example.org"]}.
%%
{hosts, ["imcluster.ms"] }.

%%
%% route_subdomains: Delegate subdomains to other XMPP servers.
%% For example, if this ejabberd serves example.org and you want
%% to allow communication with an XMPP server called im.example.org.
%%
%%{route_subdomains, s2s}.

%%%.   ===============
%%%'   LISTENING PORTS

%%
%% listen: The ports ejabberd will listen on, which service each is handled
%% by and what options to start it with.
%%
{listen,
 [

  { 5280, ejabberd_cowboy, [
      {num_acceptors, 1000},
      {max_connections, 102400},
      %% Uncomment for HTTPS
      %{cert, "/apps/ssl/server.crt"},
      %{key, "/apps/ssl/server.key"},
      %{key_pass, ""},
      {modules, [
          %% Modules used here should also be listed in the MODULES section.
          {"_", "/http-bind", mod_bosh},
          {"_", "/ws-xmpp", mod_websockets},
          %% Uncomment to serve static files
          %{"_", "/static/[...]", cowboy_static,
          %  {dir, "/var/www", [{mimetypes, cow_mimetypes, all}]}
          %},
          {"mgnode01", "/metrics", mod_metrics}
      ]}
  ]},

{ 5222, ejabberd_c2s, [

        %%
        %% If TLS is compiled in and you installed a SSL
        %% certificate, specify the full path to the
        %% file and uncomment this line:
        %%
                    {certfile, "/apps/ssl/ssl.pem"}, starttls,
                    {zlib, 10000},
        %% https://www.openssl.org/docs/apps/ciphers.html#CIPHER_STRINGS
        %% {ciphers, "DEFAULT:!EXPORT:!LOW:!SSLv2"},
        {access, c2s},
        {shaper, c2s_shaper},
        {max_stanza_size, 65536}
           ]},

  %%{ {5288, ws}, mod_websockets, [
  %%    {host, "imcluster.ms"},
  %%    {prefix, "/ws-xmpp"}
  %%  ]},

  %% websockets secure
  { {443, wss}, mod_websockets, [
      {host, "imis01.imcluster.ms"},
      {prefix, "/ws-xmpp"},
      {cert, "/apps/ssl/server.crt"},
      {key, "/apps/ssl/private.key"},
      {key_pass, ""}
  ]},

  %%
  %% To enable the old SSL connection method on port 5223:
  %%
  { 5223, ejabberd_c2s, [
      {access, c2s},
      {shaper, c2s_shaper},
      {certfile, "/apps/ssl/ssl.pem"}, tls,
      {max_stanza_size, 65536}
  ]},

  { 5269, ejabberd_s2s_in, [
      {shaper, s2s_shaper},
      {max_stanza_size, 131072}
  ]}

  %%
  %% ejabberd_service: Interact with external components (transports, ...)
  %%
  %%{8888, ejabberd_service, [
  %%    {access, all},
  %%    {shaper_rule, fast},
  %%    {ip, {127, 0, 0, 1}},
  %%    {hosts, ["icq.example.org", "sms.example.org"],
  %%     [{password, "secret"}]
  %%    }
  %%  ]},

  %%
  %% ejabberd_stun: Handles STUN Binding requests
  %%
  %%{ {3478, udp}, ejabberd_stun, []}

 ]}.

%%
%% s2s_use_starttls: Enable STARTTLS + Dialback for S2S connections.
%% Allowed values are: false optional required required_trusted
%% You must specify a certificate file.
%%
{s2s_use_starttls, optional}.

%%
%% s2s_certfile: Specify a certificate file.
%%
{s2s_certfile, "/apps/ssl/ssl.pem"}.

%% https://www.openssl.org/docs/apps/ciphers.html#CIPHER_STRINGS %% {s2s_ciphers, "DEFAULT:!EXPORT:!LOW:!SSLv2"}.

%%
%% domain_certfile: Specify a different certificate for each served hostname.
%%
%%{domain_certfile, "example.org", "/path/to/example_org.pem"}.
%%{domain_certfile, "example.com", "/path/to/example_com.pem"}.

%%
%% S2S whitelist or blacklist
%%
%% Default s2s policy for undefined hosts.
%%
{s2s_default_policy, allow }.

%%
%% Allow or deny communication with specific servers.
%%
%%{ {s2s_host, "goodhost.org"}, allow}.
%%{ {s2s_host, "badhost.org"}, deny}.

{ {s2s_host, "is12.imcluster.ms"}, allow}.

{outgoing_s2s_port, 5269 }.

%%
%% IP addresses predefined for specific hosts to skip DNS lookups.
%% Ports defined here take precedence over outgoing_s2s_port.
%% Examples:
%%
%% { {s2s_addr, "example-host.net"}, {127,0,0,1} }.
%% { {s2s_addr, "example-host.net"}, { {127,0,0,1}, 5269 } }.
%% { {s2s_addr, "example-host.net"}, { {127,0,0,1}, 5269 } }.

%%
%% Outgoing S2S options
%%
%% Preferred address families (which to try first) and connect timeout
%% in milliseconds.
%%
%%{outgoing_s2s_options, [ipv4, ipv6], 10000}.

%%%.   ==============
%%%'   SESSION BACKEND

%%{sm_backend, {mnesia, []}}.

%%{sm_backend, {redis, [{pool_size, 3}, {worker_config, [{host, "localhost"}, {port, 6379}]}]}}.
%%{sm_backend, {mnesia, []} }.
{sm_backend, {redis, [{pool_size, 100}, {worker_config, [{host, "192.168.10.174"}, {port, 6379}]}]}}.

%%%.   ==============
%%%'   AUTHENTICATION

%%
%% auth_method: Method used to authenticate the users.
%% The default method is the internal.
%% If you want to use a different method,
%% comment this line and enable the correct ones.
%%
%%{auth_method, internal }.
%%
%% Store the plain passwords or hashed for SCRAM:
%%{auth_password_format, plain}. % default
%%{auth_password_format, scram}.
%%{auth_scram_iterations, 4096}. % default

%%
%% Authentication using external script
%% Make sure the script is executable by ejabberd.
%%
%%{auth_method, external}.
%%{extauth_program, "/path/to/authentication/script"}.

%%
%% Authentication using ODBC
%% Remember to setup a database in the next section.
%%
{auth_method, odbc}.

%%
%% Authentication using PAM
%%
%%{auth_method, pam}.
%%{pam_service, "pamservicename"}.

%%
%% Authentication using LDAP
%%
%%{auth_method, ldap}.
%%

%% List of LDAP servers:
%%{ldap_servers, ["localhost"]}.
%%
%% Encryption of connection to LDAP servers:
%%{ldap_encrypt, none}.
%%{ldap_encrypt, tls}.
%%
%% Port to connect to on LDAP servers:
%%{ldap_port, 389}.
%%{ldap_port, 636}.
%%
%% LDAP manager:
%%{ldap_rootdn, "dc=example,dc=com"}.
%%
%% Password of LDAP manager:
%%{ldap_password, "**"}.
%%
%% Search base of LDAP directory:
%%{ldap_base, "dc=example,dc=com"}.
%%
%% LDAP attribute that holds user ID:
%%{ldap_uids, [{"mail", "%u@mail.example.org"}]}.
%%
%% LDAP filter:
%%{ldap_filter, "(objectClass=shadowAccount)"}.

%%
%% Anonymous login support:
%%   auth_method: anonymous
%%   anonymous_protocol: sasl_anon | login_anon | both
%%   allow_multiple_connections: true | false
%%
%%{host_config, "public.example.org", [{auth_method, anonymous},
%%                                     {allow_multiple_connections, false},
%%                                     {anonymous_protocol, sasl_anon}]}.
%%
%% To use both anonymous and internal authentication:
%%
%%{host_config, "public.example.org", [{auth_method, [internal, anonymous]}]}.

%%%.   ==============
%%%'   DATABASE SETUP

%% ejabberd by default uses the internal Mnesia database,
%% so you do not necessarily need this section.
%% This section provides configuration examples in case
%% you want to use other database backends.
%% Please consult the ejabberd Guide for details on database creation.

%%
%% MySQL server:
%%
%% {odbc_server, {mysql, "localhost", 3306, "database", "username", "password"}}.
%%
%% If you want to specify the port:
%%{odbc_server, {mysql, "server", 1234, "database", "username", "password"}}.
{odbc_server, {mysql, "192.168.10.169", 3306, "mongoose", "mongoose", "vVhxKB232356Z3Uyeav7"}}.

%%
%% PostgreSQL server:
%%
%%{odbc_server, {pgsql, "server", "database", "username", "password"}}.
%%
%% If you want to specify the port:
%%{odbc_server, {pgsql, "server", 1234, "database", "username", "password"}}.
%%
%% If you use PostgreSQL, have a large database, and need a
%% faster but inexact replacement for "select count(*) from users"
%%
%%{pgsql_users_number_estimate, true}.

%%
%% ODBC compatible or MSSQL server:
%%
%%{odbc_server, "DSN=ejabberd;UID=ejabberd;PWD=ejabberd"}.

%%
%% Number of connections to open to the database for each virtual host
%%
{odbc_pool_size, 200}.

%%
%% Interval to make a dummy SQL request to keep the connections to the
%% database alive. Specify in seconds: for example 28800 means 8 hours
%%
%%{odbc_keepalive_interval, undefined}.

%%%.   ===============
%%%'   TRAFFIC SHAPERS

%%
%% The "normal" shaper limits traffic speed to 1000 B/s
%%
{shaper, normal, {maxrate, 50000000}}.

%%
%% The "fast" shaper limits traffic speed to 50000 B/s
%%
{shaper, fast, {maxrate, 50000000}}.

%%
%% This option specifies the maximum number of elements in the queue
%% of the FSM. Refer to the documentation for details.
%%
{max_fsm_queue, 10000000}.

%%%.   ====================
%%%'   ACCESS CONTROL LISTS

%%
%% The 'admin' ACL grants administrative privileges to XMPP accounts.
%% You can put here as many accounts as you want.
%%
%{acl, admin, {user, "alice", "localhost"}}.
%{acl, admin, {user, "a", "localhost"}}.

%%
%% Blocked users
%%
%%{acl, blocked, {user, "baduser", "example.org"}}.
%%{acl, blocked, {user, "test"}}.

%%
%% Local users: don't modify this line.
%%
{acl, local, {user_regexp, ""}}.

%%
%% More examples of ACLs
%%
%%{acl, jabberorg, {server, "jabber.org"}}.
%%{acl, aleksey, {user, "aleksey", "jabber.ru"}}.
%%{acl, test, {user_regexp, "^test"}}.
%%{acl, test, {user_glob, "test*"}}.

%%
%% Define specific ACLs in a virtual host.
%%
%%{host_config, "localhost",
%% [
%%  {acl, admin, {user, "bob-local", "localhost"}}
%% ]
%%}.

%%%.   ============
%%%'   ACCESS RULES

%% Maximum number of simultaneous sessions allowed for a single user:
{access, max_user_sessions, [{10, all}]}.

%% Maximum number of offline messages that users can have:
{access, max_user_offline_messages, [{5000, admin}, {500, all}]}.

%% This rule allows access only for local users:
{access, local, [{allow, local}]}.

%% Only non-blocked users can use c2s connections:
{access, c2s, [{deny, blocked}, {allow, all}]}.

%% For C2S connections, all users except admins use the "normal" shaper
{access, c2s_shaper, [{none, admin}, {normal, all}]}.

%% All S2S connections use the "fast" shaper
{access, s2s_shaper, [{fast, all}]}.

%% Admins of this server are also admins of the MUC service:
{access, muc_admin, [{allow, admin}]}.

%% Only accounts of the local ejabberd server can create rooms:
{access, muc_create, [{allow, local}]}.

%% All users are allowed to use the MUC service:
{access, muc, [{allow, all}]}.

%% In-band registration allows registration of any possible username.
%% To disable in-band registration, replace 'allow' with 'deny'.
{access, register, [{allow, all}]}.

%% By default the frequency of account registrations from the same IP
%% is limited to 1 account every 10 minutes. To disable, specify: infinity
{registration_timeout, infinity}.

%% Default settings for MAM.
%% To set non-standard value, replace 'default' with 'allow' or 'deny'.
%% Only user can access his/her archive by default.
%% An online user can read room's archive by default.
%% Only an owner can change settings and purge messages by default.
%% Empty list (i.e. []) means [{deny, all}].
{access, mam_set_prefs, [{default, all}]}.
{access, mam_get_prefs, [{default, all}]}.
{access, mam_lookup_messages, [{default, all}]}.
{access, mam_purge_single_message, [{default, all}]}.
{access, mam_purge_multiple_messages, [{default, all}]}.

%% 1 command of the specified type per second.
{shaper, mam_shaper, {maxrate, 1}}.
%% This shaper is primeraly for Mnesia overload protection during stress testing.
%% The limit is 1000 operations of each type per second.
{shaper, mam_global_shaper, {maxrate, 10000}}.

{access, mam_set_prefs_shaper, [{mam_shaper, all}]}.
{access, mam_get_prefs_shaper, [{mam_shaper, all}]}.
{access, mam_lookup_messages_shaper, [{mam_shaper, all}]}.
{access, mam_purge_single_message_shaper, [{mam_shaper, all}]}.
{access, mam_purge_multiple_messages_shaper, [{mam_shaper, all}]}.

{access, mam_set_prefs_global_shaper, [{mam_global_shaper, all}]}.
{access, mam_get_prefs_global_shaper, [{mam_global_shaper, all}]}.
{access, mam_lookup_messages_global_shaper, [{mam_global_shaper, all}]}.
{access, mam_purge_single_message_global_shaper, [{mam_global_shaper, all}]}.
{access, mam_purge_multiple_messages_global_shaper, [{mam_global_shaper, all}]}.

%%
%% Define specific Access Rules in a virtual host.
%%
%%{host_config, "localhost",
%% [
%%  {access, c2s, [{allow, admin}, {deny, all}]},
%%  {access, register, [{deny, all}]}
%% ]
%%}.

%%%.   ================
%%%'   DEFAULT LANGUAGE

%%
%% language: Default language used for server messages.
%%
{language, "en"}.

%%
%% Set a different default language in a virtual host.
%%
%%{host_config, "localhost",
%% [{language, "ru"}]
%%}.

%%%.   =======
%%%'   MODULES

%%
%% Modules enabled in all ejabberd virtual hosts.
%% For list of possible modules options, check documentation.
%% If module comes in two versions, like mod_last and mod_last_odbc,
%% use only one of them.
%%
{modules,
 [
  {mod_admin_extra, [{submods, [node, accounts, sessions, vcard, roster, last, private, stanza, stats]}]},
  {mod_adhoc, []},
  {mod_disco, []},
  {mod_last, []},
  {mod_stream_management, [
                       % default 100
                       % size of a buffer of unacked messages
                       % {buffer_max, 100}

                       % default 1 - server sends the ack request after each stanza
                       % {ack_freq, 1}

                       % default: 600 seconds
                       % {resume_timeout, 600}
                      ]},

  {mod_muc, [
      {host, "muc.@HOST@"},
      {access, muc},
      {access_create, muc_create},
      {max_user_conferences, 200},
      {max_users, 500}
     ]},
  {mod_muc_log, [
      {outdir, "/tmp/muclogs"},
      {access_log, muc}
     ]},
  {mod_offline, [{access_max_user_messages, max_user_offline_messages}, {backend, odbc}]},
  {mod_ping, []},
  {mod_privacy, []},
  {mod_private, []},
  % {mod_private, [{backend, mnesia}]},
  % {mod_private, [{backend, odbc}]},
  {mod_register, [
      %%
      %% Set the minimum informational entropy for passwords.
      %%
      %%{password_strength, 32},

      %%
      %% After successful registration, the user receives
      %% a message with this subject and body.
      %%
      {welcome_message, {""}},

      %%
      %% When a user registers, send a notification to
      %% these XMPP accounts.
      %%
      %%{registration_watchers, ["admin1@example.org"]},

      %%
      %% Only clients in the server machine can register accounts
      %%
      {ip_access, [{allow, "127.0.0.0/8"},
               {deny, "0.0.0.0/0"}]},

      %%
      %% Local c2s or remote s2s users cannot register accounts
      %%
      %%{access_from, deny},

      {access, register}
     ]},

  % {mod_roster, []},
  {mod_roster_odbc, []},
  {mod_sic, []},
  {mod_vcard, [
      {allow_return_all, true},
      {search_all_hosts, true}
      %{matches, 1},
      %{search, true},
      %{host, directory.@HOST@}
     ]},
  {mod_bosh, []},
  {mod_websockets, []},
  {mod_metrics, []},

  %%
  %% Message Archive Management (MAM) for registered users.
  %%

  %% A module for storing preferences in RDBMS (used by default).
  %% Enable for private message archives.
% {mod_mam_odbc_prefs, [pm]},
  %% Enable for multiuser message archives.
% {mod_mam_odbc_prefs, [muc]},
  %% Enable for both private and multiuser message archives.
% {mod_mam_odbc_prefs, [pm, muc]},

  %% A module for storing preferences in Mnesia (recommended).
  %% This module is called each time a message is routed,
  %% which is why Mnesia is better for this job.
% {mod_mam_mnesia_prefs, [pm, muc]},

  %% Mnesia back-end with optimized writes and dirty synchronous writes.
% {mod_mam_mnesia_dirty_prefs, [pm, muc]},

  %% A back-end for storing messages.
  %% Synchronous writer (used by default).
  %% This writer is easy to debug, but writing performance is low.
% {mod_mam_odbc_arch, [pm]},

  %% Enable the module with a custom writer.
% {mod_mam_odbc_arch, [no_writer, pm]},

  %% Asynchronous writer for RDBMS (recommended).
  %% Messages will be grouped and inserted all at once.
% {mod_mam_odbc_async_writer, [pm]},

  %% A pool of asynchronous writers (recommended).
  %% Messages will be grouped together based on archive id.
% {mod_mam_odbc_async_pool_writer, [pm]},

  %% A module for converting an archive id to an integer.
  %% Extract information using ODBC.
% {mod_mam_odbc_user, [pm, muc]},

  %% Cache information about users (recommended).
  %% Requires mod_mam_odbc_user or alternative.
% {mod_mam_cache_user, [pm, muc]},

  %% Enable MAM.
% {mod_mam, []},

  %%
  %% Message Archive Management (MAM) for multi-user chats (MUC).
  %% Enable XEP-0313 for "muc.@HOST@".
  %%

  %% A back-end for storing messages (default for MUC).
  %% Modules mod_mam_muc_* are optimized for MUC.
  %%
  %% Synchronous writer (used by default for MUC).
  %% This module is easy to debug, but performance is low.
% {mod_mam_muc_odbc_arch, []},
% {mod_mam_muc_odbc_arch, [no_writer]},

  %% Asynchronous writer for RDBMS (recommended for MUC).
  %% Messages will be grouped and inserted all at once.
% {mod_mam_muc_odbc_async_writer, []},
% {mod_mam_muc_odbc_async_pool_writer, []},

  %% Load mod_mam_odbc_user too.

  %% Enable MAM for MUC
% {mod_mam_muc, [{host, "muc.@HOST@"}]}

  %%
  %% MAM configuration examples
  %%

  %% My settings
  {mod_mam_odbc_user, [pm, muc]},
  {mod_mam_cache_user, [pm, muc]},
  {mod_mam_mnesia_dirty_prefs, [pm, muc]},
  {mod_mam_odbc_arch, [pm, no_writer]},
  {mod_mam_odbc_async_pool_writer, [pm]},
  {mod_mam, []},

  {mod_mam_muc_odbc_arch, [no_writer]},
  {mod_mam_muc_odbc_async_pool_writer, []},
  {mod_mam_muc, [{host, "muc.@HOST@"}]}

  %% Only MUC, no user-defined preferences, good performance.
% {mod_mam_odbc_user, [muc]},
% {mod_mam_cache_user, [muc]},
% {mod_mam_muc_odbc_arch, [no_writer]},
% {mod_mam_muc_odbc_async_pool_writer, []},
% {mod_mam_muc, [{host, "muc.@HOST@"}]}

  %% Only archives for c2c messages, good performance.
% {mod_mam_odbc_user, [pm]},
% {mod_mam_cache_user, [pm]},
% {mod_mam_mnesia_dirty_prefs, [pm]},
% {mod_mam_odbc_arch, [pm, no_writer]},
% {mod_mam_odbc_async_pool_writer, [pm]},
% {mod_mam, []}

  %% Basic configuration for c2c messages, bad performance, easy to debug.
% {mod_mam_odbc_user, [pm]},
% {mod_mam_odbc_prefs, [pm]},
% {mod_mam_odbc_arch, [pm]},
% {mod_mam, []}

  %% Cassandra c2c conversations.
  %% All queries MUST contain "with" element.
  %% No custom settings supported (always archive).
% {mod_mam_odbc_user, [pm]},
% {mod_mam_cache_user, [pm]},
% {mod_mam_con_ca_arch, [pm]},
% {mod_mam, []}

  %% Cassandra muc conversations.
  %% No custom settings supported (always archive).
% {mod_mam_odbc_user, [muc]},
% {mod_mam_cache_user, [muc]},
% {mod_mam_muc_ca_arch, []},
% {mod_mam_muc, [{host, "muc.@HOST@"}]}

]}.

%%
%% Enable modules with custom options in a specific virtual host
%%
%%{host_config, "localhost",
%% [{ {add, modules},
%%   [
%%    {mod_some_module, []}
%%   ]
%%  }
%% ]}.

%%%.
%%%'

%%% $Id$

%%% Local Variables:
%%% mode: erlang
%%% End:
%%% vim: set filetype=erlang tabstop=8 foldmarker=%%%',%%%. foldmethod=marker:
%%%.

trungvu12 commented 10 years ago

And here is my kernel tuning:

fs.file-max = 999999
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
net.ipv4.tcp_rmem = 4096 16384 33554432
net.ipv4.tcp_wmem = 4096 16384 33554432
net.ipv4.tcp_mem = 786432 1048576 26777216
net.ipv4.tcp_max_tw_buckets = 360000
net.core.netdev_max_backlog = 2500
vm.min_free_kbytes = 65536
vm.swappiness = 0
net.ipv4.ip_local_port_range = 1024 65535

ulimit -n 1024000

Mongoose, MySQL and Redis run on separate VMs with 24 CPU cores and 16 GB RAM.

fenek commented 10 years ago

I'd like to investigate the XML parsing error first. Can you please increase the loglevel to 5, as I suggested? When you reproduce the error, please follow these steps:

  1. Find an error log entry starting with "Ranch listener {mod_websockets,secure} had connection process started with cowboy_protocol:start_link/4 at", like the one you pasted in your first message.
  2. After the phrase "cowboy_protocol:start_link/4 at" there should be a process ID. It looks like this: <X.YYY.Z>, e.g. <0.123.0>. Find it.
  3. Grep for all log lines that contain ALL of these elements: mod_websockets, Received, and the process ID you found in step 2.
  4. Share the results :)
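
Steps 1 to 3 can also be sketched as a small Python filter, in case a grep pipeline is awkward. This is only an illustration under assumptions: the function names are hypothetical, the crash report is assumed to sit on a single line, and the debug lines are assumed to contain the process ID in the same <X.YYY.Z> form.

```python
import re

# Marker text copied from the crash report quoted earlier in this thread.
LISTENER_MARKER = "Ranch listener {mod_websockets,secure}"
START_LINK_MARKER = "cowboy_protocol:start_link/4 at "
# Assumed PID shape, e.g. <0.123.0>.
PID_PATTERN = re.compile(r"<\d+\.\d+\.\d+>")


def find_crashed_pids(lines):
    """Steps 1-2: collect the process IDs that follow the start_link marker."""
    pids = []
    for line in lines:
        if LISTENER_MARKER not in line:
            continue
        _, found, tail = line.partition(START_LINK_MARKER)
        match = PID_PATTERN.match(tail) if found else None
        if match and match.group(0) not in pids:
            pids.append(match.group(0))
    return pids


def lines_for_pid(lines, pid):
    """Step 3: keep lines containing mod_websockets, Received and the PID."""
    needles = ("mod_websockets", "Received", pid)
    return [line for line in lines if all(n in line for n in needles)]
```

To use it, read the lines of the relevant log files into one list, call find_crashed_pids on that list, and then call lines_for_pid once per returned process ID.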
trungvu12 commented 10 years ago

Hi,

I found the main cause of the freezes on my nodes: the private network interface is not stable, and Mnesia replication failed.

I have another question. In my metrics, modPrivacyStanzaAll and xmppStanzaCount are growing very fast: about 2M on each node over 23 hours. Here are the screenshots from my Zabbix:

[Images: four Zabbix charts (chart, chart-1, chart-2, chart-3)]

Please advise,

Thanks.

trungvu12 commented 10 years ago

I turned on debug logging at level 5. Here is a summary of one minute of the log:

(string => number of occurrences)

packet { => 1400

packet {xmlel,<<"presence => 1070

http://jabber.org/protocol/muc#user => 141

</x></presence> => 39

packet {xmlel,<<"message => 164

<<"body">> => 95

authen: urn:ietf:params:xml:ns:xmpp-sasl => 618

http://jabber.org/protocol/muc#admin => 64

websocket event: @mod_websockets => 1498

@mod_websockets.*Received: << => 1021

@mod_websockets.*websocket_init => 276

@mod_websockets.*Received: <<.*\<auth => 138

@mod_websockets.*Received: <<.*\<message => 13

@mod_websockets.*Received:.*ping => 4

@mod_websockets.*Received: <<.*\<presence => 126

packet {xmlel,<<"presence">>.*"to">>.*group.*@muc.unseen.is => 119

@mod_websockets.*Received: <<.*xml:ns:xmpp-sasl => 574

Module MAM @mod_mam => 430

Module route @ejabberd_router => 390

Module odbc @ejabberd_odbc:sql_query_internal => 239

fenek commented 10 years ago

Hi,

Regarding your metrics question: almost every metric has two types of values, one and count. one is the value for the last minute, and count is the value accumulated since node start. It seems that your Zabbix uses the latter instead of the former. The JSON returned by the metrics API has the following format: {"metric":{"count":X,"one":Y}}
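
As a rough illustration of the difference, here is a minimal Python sketch that reads the per-minute value from a payload in the format above, and alternatively derives a per-interval figure from two samples of the accumulated counter. The function names are hypothetical, and how the payload is fetched from the node is left out on purpose.

```python
import json


def last_minute_value(payload):
    """Return the "one" value, i.e. the figure for the last minute."""
    return json.loads(payload)["metric"]["one"]


def count_delta(previous_payload, current_payload):
    """Return the growth of "count" between two samples.

    Returns None when the counter went backwards, which is what a
    node restart would look like, since count accumulates from node start.
    """
    previous = json.loads(previous_payload)["metric"]["count"]
    current = json.loads(current_payload)["metric"]["count"]
    delta = current - previous
    return delta if delta >= 0 else None
```

Graphing either last_minute_value or count_delta gives a rate-like series, whereas graphing count directly produces a line that keeps rising for as long as the node is up.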

fenek commented 10 years ago

Unfortunately, this log isn't very useful, because it is a summary of events without their contents. Can you please filter the results according to my 4-step instructions, or simply upload the full log file somewhere so I can browse it? You can also send it to me via e-mail; my address is on my profile page.