No reconnection for mongodb_topology_pool

comtihon / mongodb-erlang

MongoDB driver for Erlang

Apache License 2.0

No reconnection for mongodb_topology_pool #156

Open johnzeng opened 7 years ago

johnzeng commented 7 years ago

I use mongodb_topology_pool to pool my MongoDB requests, but I found that if MongoDB is shut down, mongodb_topology_pool cannot reconnect, so I have to restart the whole application.

1. Run the application with the MongoDB SDK.
2. Stop MongoDB.
3. Requests hang in the MongoDB operation.
4. Restart MongoDB; no reconnection happens.

I have tried to figure out what's wrong, and it looks like mc_pool_sup is always down after the workers get a connection error.

comtihon commented 7 years ago

Sounds like a bug.

johnzeng commented 7 years ago

I think the problem is that the supervisor retries the connection too quickly, before MongoDB has been restarted. The restarts exceed the supervisor's restart intensity within the period, so the supervisor itself is terminated.
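
The restart-intensity behaviour described above can be reproduced in isolation. The following is a minimal sketch, not code from the driver: the module name `intensity_demo`, the always-failing worker, and the intensity/period values are all assumptions chosen for illustration (map-based supervisor flags need OTP 18 or later). It shows a standard OTP supervisor giving up once its child fails more often than the configured intensity allows.

```erlang
-module(intensity_demo).
-behaviour(supervisor).
-export([run/0, start_worker/0, init/1]).

%% Hypothetical worker that dies right after it starts, standing in for a
%% pool worker whose connection to a stopped MongoDB is refused.
start_worker() ->
    {ok, spawn_link(fun() -> exit(connection_refused) end)}.

init([]) ->
    %% Assumed values: allow at most 3 restarts within 5 seconds.
    SupFlags = #{strategy => one_for_one, intensity => 3, period => 5},
    Child = #{id => worker,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Starts the supervisor and reports why (or whether) it terminated.
%% Note: this sets trap_exit in the calling process.
run() ->
    process_flag(trap_exit, true),
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    receive
        {'EXIT', Sup, Reason} -> Reason
    after 2000 ->
        still_running
    end.
```

Because the worker fails immediately every time, the restarts pile up well inside the 5-second period and `intensity_demo:run()` should return `shutdown`, the reason OTP uses when a supervisor exceeds its maximum restart intensity. After that, nothing restarts the children unless the supervisor itself has a parent that restarts it.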

comtihon commented 7 years ago

Yes, I noticed that too when testing this issue. I am now thinking about refactoring the mongoc module, as it is very hard for me to dive into this code. After the refactoring I think this error will be solved. But if it is critical for you, you can make a PR to master with the old code.

johnzeng commented 7 years ago

No, it's not critical. I happened to find this when I was switching my MongoDB server, and it won't happen frequently. I can wait for the refactoring, and I think that will be better.

jessa0 commented 7 years ago

The only thing that starts mc_pool_sup is mongoc:connect/3, which ends up calling mc_pool_sup:start_link/0 in the calling process. So if you happen to call mongoc:connect/3 in a process that ignores its exit, like I did (e.g. a supervisor, which always ignores 'EXIT' from non-children), then nothing will restart it. There are a few possible solutions I can think of:

1) Return the mc_pool_sup pid from mongoc:connect/3 if it had to be started, but this would change its return type.
2) Remove the call to mc_pool_sup:ensure_started/0 from mongoc:connect/3, and add to the documentation that you must start mc_pool_sup before using mongoc.
3) Ditto, but just start mc_pool_sup from mc_super_sup.
4) Use supervisor:start_child(mc_super_sup, ...) in mc_pool_sup:ensure_started/0 instead of calling start_link/0 directly.
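
Option 4 could look roughly like the sketch below. This is not the driver's code: `demo_top_sup` and `demo_pool_sup` are hypothetical stand-ins for mc_super_sup and mc_pool_sup, and the child spec and restart limits are assumptions. The point it illustrates is that the pool supervisor becomes a child of an already-running top-level supervisor, which then restarts it, instead of being linked to whichever process happened to call connect.

```erlang
-module(ensure_started_sketch).
-behaviour(supervisor).
-export([start_top/0, start_pool_sup/0, ensure_started/0, init/1]).

%% Hypothetical stand-in for mc_super_sup: a registered top-level supervisor.
start_top() ->
    supervisor:start_link({local, demo_top_sup}, ?MODULE, top).

%% Hypothetical stand-in for mc_pool_sup:start_link/0.
start_pool_sup() ->
    supervisor:start_link({local, demo_pool_sup}, ?MODULE, pool).

%% Both demo supervisors start with no children and assumed restart limits.
init(top) ->
    {ok, {#{strategy => one_for_one, intensity => 5, period => 10}, []}};
init(pool) ->
    {ok, {#{strategy => one_for_one, intensity => 5, period => 10}, []}}.

%% Ask the top-level supervisor to own the pool supervisor. Safe to call
%% repeatedly: an already running child is treated as success.
ensure_started() ->
    Spec = #{id => demo_pool_sup,
             start => {?MODULE, start_pool_sup, []},
             restart => permanent,
             type => supervisor},
    case supervisor:start_child(demo_top_sup, Spec) of
        {ok, _Pid} -> ok;
        {error, {already_started, _Pid}} -> ok;
        {error, Reason} -> {error, Reason}
    end.
```

With this shape, calling `ensure_started_sketch:start_top()` once and then `ensure_started_sketch:ensure_started()` from any process leaves `demo_pool_sup` supervised by `demo_top_sup`, so the calling process no longer has to handle its exit. A trade-off is that the top-level supervisor must already be running, which is the same requirement options 2 and 3 place on the application.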