ErlyORM / boss_db

BossDB: a sharded, caching, pooling, evented ORM for Erlang

out of control when some db shards are down #225

Closed liuzhen closed 8 years ago

liuzhen commented 9 years ago

I am using a main DB config together with one shard, db_shard_0, and I assumed that when db_shard_0 is down the main DB would keep working.

Instead, the OS running the main DB appeared to receive an endless stream of TCP connection attempts (very high CPU sys usage). Here is the log:

2015-05-30 17:51:10.341 [error] <0.6259.3> gen_server <0.6259.3> terminated with reason: {{{badmatch,{error,econnrefused}},[{pgsql_sock,command,2,[{file,"src/pgsql_sock.erl"},{line,163}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,607}]},{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,639}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]},{gen_server,call,[<0.6267.3>,{connect,"","","",[{port,5432},{database,""}]},infinity]}} in gen_server:call/3 line 190
2015-05-30 17:51:10.342 [error] <0.6259.3> CRASH REPORT Process <0.6259.3> with 1 neighbours exited with reason: {{{badmatch,{error,econnrefused}},[{pgsql_sock,command,2,[{file,"src/pgsql_sock.erl"},{line,163}]},{gen_server,try_handle_call,4,[{file,"gen_server.erl"},{line,607}]},{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,639}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]},{gen_server,call,[<0.6267.3>,{connect,"","","",[{port,5432},{database,""}]},infinity]}} in gen_server:terminate/7 line 804

Is it a bug?

liuzhen commented 9 years ago

boss_db_controller.erl:

handle_cast({try_connect, Options}, State) when State#state.connection_state /= connected ->
    Adapter = State#state.adapter,
    CacheEnable = State#state.cache_enable,
    CacheTTL = State#state.cache_ttl,
    try
        case connections_for_adapter(Adapter, Options) of
            {ok, {ReadConn, WriteConn}} ->
                {Shards, ModelDict} = make_shards(Options, Adapter),
                {noreply, #state{connection_state = connected,
                                 connection_delay = 1,
                                 adapter = Adapter,
                                 read_connection = ReadConn,
                                 write_connection = WriteConn,
                                 shards = lists:reverse(Shards),
                                 model_dict = ModelDict,
                                 options = Options,
                                 cache_enable = CacheEnable,
                                 cache_ttl = CacheTTL,
                                 cache_prefix = db}};
            _Failure ->
                reconnect_no_reply(Options, State, Adapter, CacheEnable, CacheTTL)
        end
    catch
        _:_Error ->
            reconnect_no_reply(Options, State, Adapter, CacheEnable, CacheTTL)
    end;

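I wrapped all of the DB connection setup, for both the main DB and the db_shards, in a try-catch. With that change a failed connection falls through to reconnect_no_reply, and the reconnect logic works.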
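
For anyone reading along, the idea behind that change can be sketched outside of boss_db. This is only a minimal illustration under my own assumptions: the module name retry_sketch and the function connect_with_retry/3 are hypothetical and are not part of boss_db or its adapters. It shows a connection attempt that may either return an error or crash (for example with a badmatch on {error,econnrefused}) being turned into a delayed retry instead of a process crash.

```erlang
-module(retry_sketch).
-export([connect_with_retry/3]).

%% Hypothetical helper, not part of boss_db.
%% ConnectFun is a zero-arity fun expected to return {ok, Conn}.
%% Any other return value, or any exception, counts as a failed attempt.
connect_with_retry(_ConnectFun, 0, _DelayMs) ->
    {error, retries_exhausted};
connect_with_retry(ConnectFun, AttemptsLeft, DelayMs) when AttemptsLeft > 0 ->
    Result =
        try ConnectFun() of
            {ok, Conn} -> {ok, Conn};
            Other -> {failed, Other}
        catch
            %% Catches errors, exits and throws, e.g. a badmatch
            %% raised somewhere inside the connection code.
            Class:Reason -> {failed, {Class, Reason}}
        end,
    case Result of
        {ok, _} = Ok ->
            Ok;
        {failed, _Why} when AttemptsLeft =:= 1 ->
            {error, retries_exhausted};
        {failed, _Why} ->
            %% Wait before retrying so that a dead shard is not hit
            %% with back-to-back connection attempts; double the delay
            %% for the next round.
            timer:sleep(DelayMs),
            connect_with_retry(ConnectFun, AttemptsLeft - 1, DelayMs * 2)
    end.
```

Calling retry_sketch:connect_with_retry(fun() -> erlang:error(econnrefused) end, 3, 100) tries three times with growing pauses and then returns an error tuple rather than crashing the caller. Catching every exception class is a blunt choice, though. It keeps the process alive, but it can also hide genuine bugs, so logging the caught class and reason is probably worthwhile.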