mongodb-haskell / mongodb

MongoDB driver for Haskell
http://hackage.haskell.org/package/mongoDB
Apache License 2.0
170 stars 44 forks source link

ReplicaSet & Reused pipes #119

Open acondolu opened 4 years ago

acondolu commented 4 years ago

Hello,

This is more of a question about the current design of the ReplicaSet functions, which don't seem to play well with multi-threading.

Background

Connection.hs supports mongo replica sets by storing the known members in a ReplicaSet object. If the user desires e.g. the replica set primary, the function primary first queries the known members for the address of the primary host; then it returns a Pipe to the primary, if available. However, that function does not actually return a new connection to the primary, but instead it possibly re-uses a previously open connection (see connection function).

Question

My scenario is a multi-threaded application, using a connection pool to distribute connections to server threads. It would be great if I could simply use primary as the resource-creating function of the pool, but this is not possible since subsequent connections taken from the pool would then be exactly the same connection (because of re-use).

This is a blocker, since I can't use any of the ReplicaSet functions. A trivial solution would be to expose an additional function that simply returns the Host, instead of a Pipe, like:

primary :: ReplicaSet -> IO Host

But I was wondering: is there any particular reason for re-using connections in the primary/secondaryOk/routedHost functions? I may be missing somthing, but it looks like this behaviour is good for simple single-threaded apps, but no-good for more realistic multi-threaded servers.

acondolu commented 3 years ago

Hi! If there are no objections I will open a PR solving this issue. Since sharing the connections as it is done now is not thread-safe, I propose to not reuse connections at all, making connection always return a new connection.

Another solution (see here): allow reusing connections, but only by a thread at a time. In the linked commit, I've moved the code calling fetchReplicaInfo inside the scope of the modifyMVar, so that only one thread at a time can update the members of the replica set. The downside is that updating the members then blocks the other threads asking for a connection, so I'm not sure this is a good idea.