Closed miles2smiles closed 5 years ago
Thank you for your time.
Team RabbitMQ uses GitHub issues for specific actionable items engineers can work on. GitHub issues are not used for questions, investigations, root cause analysis, discussions of potential issues, etc. (as defined by this team).
We get at least a dozen questions through various venues every single day, often light on details. At that rate GitHub issues can very quickly turn into something impossible to navigate and make sense of, even for our team. Because GitHub is a tool our team uses heavily nearly every day, the signal/noise ratio of issues is something we care about a lot.
Please post this to rabbitmq-users.
Thank you.
RabbitMQ does not keep crashing. "Crash" here refers to an unhandled exception. It happens in the node memory monitor because it tries to start a subprocess and that fails with an emfile ("too many open files"). This is a kernel limit that RabbitMQ does not control. See Open File Handle Limit in the docs. Production Checklist recommends 30K minimum vs. the default 1024 (on Linux).
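As a rough way to confirm which limit the running node actually has, here is a minimal Python sketch that parses the "Max open files" row of a Linux `/proc/<pid>/limits` file and compares the soft limit with the 30K figure mentioned above. The function names and the `RECOMMENDED_MIN` constant are made up for illustration, "30K" is read as 30,000, and finding the PID of the Erlang VM process is left to the operator.

```python
RECOMMENDED_MIN = 30_000  # assumption: "30K minimum" from the reply, read as 30,000


def _to_int(value):
    """Convert a limits field to int; 'unlimited' becomes None."""
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits text."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            fields = line[len(label):].split()
            return _to_int(fields[0]), _to_int(fields[1])
    raise ValueError("no 'Max open files' row found")


def is_below_recommendation(soft, minimum=RECOMMENDED_MIN):
    """True when a finite soft limit is lower than the recommended minimum."""
    return soft is not None and soft < minimum


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_limits.py <pid of the Erlang VM>
    with open(f"/proc/{sys.argv[1]}/limits") as f:
        soft, hard = parse_max_open_files(f.read())
    print(f"soft={soft} hard={hard} below_recommendation={is_below_recommendation(soft)}")
```

Reading the limit from `/proc` rather than from a login shell matters because a service started by an init system may not inherit the limits of an interactive session.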
In 3.7.7 it is not really necessary to use subprocesses in the memory monitor. The allocated strategy doesn't use any and is very close in precision to the rss one which your node seems to be configured to use.
Hi, I am new to RabbitMQ and am having an issue keeping the service running.
I recently upgraded to RabbitMQ 3.7.7 and Erlang 20.2. At the same time, developers also made some changes to the code, and now I am unable to identify what could have caused the issue.
I noticed that after some time the RabbitMQ server stops responding and returns a 500 error.
Below are the details:
RabbitMQ 3.7.7
Erlang 20.2
Client Library: NodeJS AMQPLib
OS: CentOS Linux release 7.4.1708 (Core)
uname -a
Linux dev-rabbitmq 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Plugins
rabbitmq-plugins list
Configured: E = explicitly enabled; e = implicitly enabled
| Status: = running on rabbit@dev-rabbitmq
|/
[ ] rabbitmq_amqp1_0 3.7.7
[ ] rabbitmq_auth_backend_cache 3.7.7
[ ] rabbitmq_auth_backend_http 3.7.7
[ ] rabbitmq_auth_backend_ldap 3.7.7
[ ] rabbitmq_auth_mechanism_ssl 3.7.7
[ ] rabbitmq_consistent_hash_exchange 3.7.7
[ ] rabbitmq_event_exchange 3.7.7
[ ] rabbitmq_federation 3.7.7
[ ] rabbitmq_federation_management 3.7.7
[ ] rabbitmq_jms_topic_exchange 3.7.7
[E] rabbitmq_management 3.7.7
[e] rabbitmq_management_agent 3.7.7
[ ] rabbitmq_mqtt 3.7.7
[ ] rabbitmq_peer_discovery_aws 3.7.7
[ ] rabbitmq_peer_discovery_common 3.7.7
[ ] rabbitmq_peer_discovery_consul 3.7.7
[ ] rabbitmq_peer_discovery_etcd 3.7.7
[ ] rabbitmq_peer_discovery_k8s 3.7.7
[ ] rabbitmq_random_exchange 3.7.7
[ ] rabbitmq_recent_history_exchange 3.7.7
[ ] rabbitmq_sharding 3.7.7
[ ] rabbitmq_shovel 3.7.7
[ ] rabbitmq_shovel_management 3.7.7
[ ] rabbitmq_stomp 3.7.7
[ ] rabbitmq_top 3.7.7
[ ] rabbitmq_tracing 3.7.7
[ ] rabbitmq_trust_store 3.7.7
[e] rabbitmq_web_dispatch 3.7.7
[ ] rabbitmq_web_mqtt 3.7.7
[ ] rabbitmq_web_mqtt_examples 3.7.7
[ ] rabbitmq_web_stomp 3.7.7
[ ] rabbitmq_web_stomp_examples 3.7.7
CRASH LOG:
c/rabbit_mgmt_db.erl"},{line,363}]},{rabbit_mgmt_db,list_queue_stats,3,[{file,"src/rabbit_mgmt_db.erl"},{line,360}]},{timer,tc,2,[{file,"timer.erl"},{line,181}]},{rabbit_mgmt_db_cache,handle_call,3,[{file,"src/rabbit_mgmt_db_cache.erl"},{line,107}]},{gen_server,try_handle_call,4,[{file,"gen..."},...]},...]} in context child_terminated
2018-08-13 08:56:07.195 [error] <0.1607.8> Generic server <0.1607.8> terminating
Last message in was {submit,#Fun,<0.1618.8>,reuse}
When Server state == {from,<0.1618.8>,#Ref<0.1004433305.138936321.216641>}
Reason for termination ==
{{badkey,'rabbit@oneapp-rabbitmq'},[{maps,get,['rabbit@oneapp-rabbitmq',#{}],[]},{rabbit_mgmt_db,'-node_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,596}]},{worker_pool_worker,handle_call,3,[{file,"src/worker_pool_worker.erl"},{line,105}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1026}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,257}]}]}
2018-08-13 08:56:07.195 [error] <0.1607.8> CRASH REPORT Process <0.1607.8> with 0 neighbours exited with reason: {{badkey,'rabbit@oneapp-rabbitmq'},[{maps,get,['rabbit@oneapp-rabbitmq',#{}],[]},{rabbit_mgmt_db,'-node_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,596}]},{worker_pool_worker,handle_call,3,[{file,"src/worker_pool_worker.erl"},{line,105}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1026}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,257}]}]} in gen_server2:terminate/3 line 1161
2018-08-13 08:56:07.195 [error] <0.674.0> Supervisor management_worker_pool_sup had child 3 started with worker_pool_worker:start_link(management_worker_pool) at <0.1607.8> exit with reason {{badkey,'rabbit@oneapp-rabbitmq'},[{maps,get,['rabbit@oneapp-rabbitmq',#{}],[]},{rabbit_mgmt_db,'-node_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,596}]},{worker_pool_worker,handle_call,3,[{file,"src/worker_pool_worker.erl"},{line,105}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1026}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,257}]}]} in context child_terminated
2018-08-13 08:56:07.196 [error] <0.1618.8> CRASH REPORT Process <0.1618.8> with 0 neighbours exited with reason: {{{badkey,'rabbit@oneapp-rabbitmq'},[{maps,get,['rabbit@oneapp-rabbitmq',#{}],[]},{rabbit_mgmt_db,'-node_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,596}]},{worker_pool_worker,handle_call,3,[{file,"src/worker_pool_worker.erl"},{line,105}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1026}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,257}]}]},{gen_server2,call,[<0.1607.8>,{submit,#Fun,<0.1618.8>,reuse},infinity]}} in gen_server2:call/3 line 327 in gen_server2:call/3 line 327
2018-08-13 08:56:07.196 [error] <0.1072.8> Ranch listener rabbit_web_dispatch_sup_15672, connection process <0.1072.8>, stream 93 had its request process <0.1618.8> exit with reason {{{badkey,'rabbit@oneapp-rabbitmq'},[{maps,get,['rabbit@oneapp-rabbitmq',#{}],[]},{rabbit_mgmt_db,'-node_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,596}]},{worker_pool_worker,handle_call,3,[{file,"src/worker_pool_worker.erl"},{line,105}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1026}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,257}]}]},{gen_server2,call,[<0.1607.8>,{submit,#Fun,<0.1618.8>,reuse},infinity]}} and stacktrace [{gen_server2,call,3,[{file,"src/gen_server2.erl"},{line,327}]},{rabbit_mgmt_wm_nodes,to_json,2,[{file,"src/rabbit_mgmt_wm_nodes.erl"},{line,39}]},{cowboy_rest,call,3,[{file,"src/cowboy_rest.erl"},{line,1128}]},{cowboy_rest,set_resp_body,2,[{file,"src/cowboy_rest.erl"},{line,1019}]},{cowboy_rest,upgrade,4,[{file,"src/cowboy_rest.erl"},{line,238}]},{cowboy_stream_h,execute,3,[{file,"src/cowboy_stream_h.erl"},{line,205}]},{cowboy_stream_h,request_process,3,[{file,"src/cowboy_stream_h.erl"},{line,184}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]
2018-08-13 08:56:17.180 [error] <0.1617.8> Generic server rabbit_mgmt_db_cache_queues terminating
Last message in was {fetch,#Fun,[[[{name,<<"Notifications">>},{vhost,<<"reach_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.323.0>},{state,live}],[{name,<<"Logs">>},{vhost,<<"reach_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.326.0>},{state,live}],[{name,<<"pushNotifications">>},{vhost,<<"oaop_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.380.0>},{state,live}],[{name,<<"Notifications">>},{vhost,<<"oaop_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.383.0>},{state,live}],[{name,<<"ProfileBuilder">>},{vhost,...},...],...]]}
When Server state == {state,none,[],undefined,5}
Reason for termination ==
{{badkey,{resource,<<"reach_fednet">>,queue,<<"Notifications">>}},[{maps,get,[{resource,<<"reach_fednet">>,queue,<<"Notifications">>},#{}],[]},{rabbit_mgmt_db,'-list_queue_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,363}]},{rabbit_mgmt_db,list_queue_stats,3,[{file,"src/rabbit_mgmt_db.erl"},{line,360}]},{timer,tc,2,[{file,"timer.erl"},{line,181}]},{rabbit_mgmt_db_cache,handle_call,3,[{file,"src/rabbit_mgmt_db_cache.erl"},{line,107}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,636}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,665}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}
Client <0.1623.8> stacktrace
[{gen,do_call,4,[{file,"gen.erl"},{line,169}]},{gen_server,call,3,[{file,"gen_server.erl"},{line,210}]},{rabbit_mgmt_db,submit_cached,4,[{file,"src/rabbit_mgmt_db.erl"},{line,714}]},{rabbit_mgmt_util,augment,2,[{file,"src/rabbit_mgmt_util.erl"},{line,423}]},{rabbit_mgmt_util,run_augmentation,2,[{file,"src/rabbit_mgmt_util.erl"},{line,400}]},{rabbit_mgmt_util,augment_resources0,6,[{file,"src/rabbit_mgmt_util.erl"},{line,389}]},{rabbit_mgmt_util,with_valid_pagination,3,[{file,"src/rabbit_mgmt_util.erl"},{line,313}]},{rabbit_mgmt_wm_queues,to_json,2,[{file,"src/rabbit_mgmt_wm_queues.erl"},{line,52}]}]
2018-08-13 08:56:17.181 [error] <0.1617.8> CRASH REPORT Process rabbit_mgmt_db_cache_queues with 0 neighbours crashed with reason: {{badkey,{resource,<<"reach_fednet">>,queue,<<"Notifications">>}},[{maps,get,[{resource,<<"reach_fednet">>,queue,<<"Notifications">>},#{}],[]},{rabbit_mgmt_db,'-list_queue_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,363}]},{rabbit_mgmt_db,list_queue_stats,3,[{file,"src/rabbit_mgmt_db.erl"},{line,360}]},{timer,tc,2,[{file,"timer.erl"},{line,181}]},{rabbit_mgmt_db_cache,handle_call,3,[{file,"src/rabbit_mgmt_db_cache.erl"},{line,107}]},{gen_server,try_handle_call,4,[{file,"gen..."},...]},...]}
2018-08-13 08:56:17.181 [error] <0.679.0> Supervisor rabbit_mgmt_db_cache_sup had child rabbit_mgmt_db_cache_queues started with rabbit_mgmt_db_cache:start_link(queues) at <0.1617.8> exit with reason {{badkey,{resource,<<"reach_fednet">>,queue,<<"Notifications">>}},[{maps,get,[{resource,<<"reach_fednet">>,queue,<<"Notifications">>},#{}],[]},{rabbit_mgmt_db,'-list_queue_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,363}]},{rabbit_mgmt_db,list_queue_stats,3,[{file,"src/rabbit_mgmt_db.erl"},{line,360}]},{timer,tc,2,[{file,"timer.erl"},{line,181}]},{rabbit_mgmt_db_cache,handle_call,3,[{file,"src/rabbit_mgmt_db_cache.erl"},{line,107}]},{gen_server,try_handle_call,4,[{file,"gen..."},...]},...]} in context child_terminated
2018-08-13 08:56:17.181 [error] <0.1623.8> CRASH REPORT Process <0.1623.8> with 0 neighbours exited with reason: {{{badkey,{resource,<<"reach_fednet">>,queue,<<"Notifications">>}},[{maps,get,[{resource,<<"reach_fednet">>,queue,<<"Notifications">>},#{}],[]},{rabbit_mgmt_db,'-list_queue_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,363}]},{rabbit_mgmt_db,list_queue_stats,3,[{file,"src/rabbit_mgmt_db.erl"},{line,360}]},{timer,tc,2,[{file,"timer.erl"},{line,181}]},{rabbit_mgmt_db_cache,handle_call,3,[{file,"src/rabbit_mgmt_db_cache.erl"},{line,107}]},{gen_server,try_handle_call,4,[{file,"g..."},...]},...]},...} in gen_server:call/3 line 214 in gen_server:call/3 line 214
2018-08-13 08:56:17.182 [error] <0.1069.8> Ranch listener rabbit_web_dispatch_sup_15672, connection process <0.1069.8>, stream 94 had its request process <0.1623.8> exit with reason {{{badkey,{resource,<<"reach_fednet">>,queue,<<"Notifications">>}},[{maps,get,[{resource,<<"reach_fednet">>,queue,<<"Notifications">>},#{}],[]},{rabbit_mgmt_db,'-list_queue_stats/3-lc$^1/1-1-',4,[{file,"src/rabbit_mgmt_db.erl"},{line,363}]},{rabbit_mgmt_db,list_queue_stats,3,[{file,"src/rabbit_mgmt_db.erl"},{line,360}]},{timer,tc,2,[{file,"timer.erl"},{line,181}]},{rabbit_mgmt_db_cache,handle_call,3,[{file,"src/rabbit_mgmt_db_cache.erl"},{line,107}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,636}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,665}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]},{gen_server,call,[<0.1617.8>,{fetch,#Fun,[[[{name,<<"Notifications">>},{vhost,<<"reach_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.323.0>},{state,live}],[{name,<<"Logs">>},{vhost,<<"reach_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.326.0>},{state,live}],[{name,<<"pushNotifications">>},{vhost,<<"oaop_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.380.0>},{state,live}],[{name,<<"Notifications">>},{vhost,<<"oaop_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.383.0>},{state,live}],[{name,<<"ProfileBuilder">>},{vhost,<<"oaop_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.386.0>},{state,live}],[{name,<<"smsNotifications">>},{vhost,<<"oaop_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.389.0>},{state,live}],[{name,<<"AttributesUpdate">>},{vhost,<<"oaop_fednet">>},{durable,true},{auto_delete,false},{exclusive,false},{owner_pid,none},{arguments,#{}},{pid,<0.392.0>},{state,live}],...]]},...]}} and stacktrace [{gen_server,call,3,[{file,"gen_server.erl"},{line,214}]},{rabbit_mgmt_db,submit_cached,4,[{file,"src/rabbit_mgmt_db.erl"},{line,714}]},{rabbit_mgmt_util,augment,2,[{file,"src/rabbit_mgmt_util.erl"},{line,423}]},{rabbit_mgmt_util,run_augmentation,2,[{file,"src/rabbit_mgmt_util.erl"},{line,400}]},{rabbit_mgmt_util,augment_resources0,6,[{file,"src/rabbit_mgmt_util.erl"},{line,389}]},{rabbit_mgmt_util,with_valid_pagination,3,[{file,"src/rabbit_mgmt_util.erl"},{line,313}]},{rabbit_mgmt_wm_queues,to_json,2,[{file,"src/rabbit_mgmt_wm_queues.erl"},{line,52}]},{cowboy_rest,call,3,[{file,"src/cowboy_rest.erl"},{line,1128}]}]
2018-08-13 08:56:17.199 [error] <0.1611.8> ** Generic server <0.1611.8> terminating