mashingan / anonimongo

Another Nim pure Mongo DB driver
MIT License
43 stars 5 forks source link

replicas silently ignored in mongodb://user:pass@host1:port1,host2:port2/ #6

Closed timotheecour closed 4 years ago

timotheecour commented 4 years ago

mongodb+srv not working is reported in https://github.com/mashingan/anonimongo/issues/5; as a workaround, I'm trying to use the corresponding shards by explicitly listing the host/ports in the uri as follows:

mongodb+srv://user:pass@mongoatlas1.xxx.mongodb.net
=>
mongodb://user:pass@mongoatlas1-shard-00-00.xxx.mongodb.net:27017,mongoatlas1-shard-00-01.xxx.mongodb.net:27017,mongoatlas1-shard-00-02.xxx.mongodb.net:27017

however, this seems to silently only take into account the 1st host/port, so that if I do a write operation (eg insert), it fails with:

/Users/timothee/git_clone/nim/anonimongo/src/anonimongo/core/utils.nim(75) crudopsIter
Error: unhandled exception: not master [MongoError]

depending on the order I specify in the host/port list and depending on which of the replicas is the current master.

Given that we can't predict which replica is master (and that can change), this can be a problem.

links

mashingan commented 4 years ago

see https://developer.mongodb.com/article/srv-connection-strings which explains a bit more the relationship between mongodb+srv and the replicas, and how to obtain the replicas from a srv host using a python script

Thanks, will look it later.

maybe related to this README:

Exactly, as I didn't understand how the multihost played role so couldn't implement it. I've started adding the replica set commands (branch replication-db) as I planned to add support for multihost URI connection.

Error: unhandled exception: not master [MongoError]

This is related with multihost indeed. In replication-db branch, I added the command isMaster so the Mongo object could check its connection whether it's master or not.
Write operations can only be done on master connection while read operations is depend on readPreferences.


For implementing multi connections, the internal implementation will need to change to accommodate it.
The current implementation only supports for single connection, it's just a matter of adding a layer above current implementation while keeping the consumed APIs (Mongodb commands) still the same.

timotheecour commented 4 years ago

@mashingan maybe this helps?

(and as you probably already know, srv support in dnsclient.nim requires latest HEAD in https://github.com/ba0f3/dnsclient.nim/commit/bce86b18fa9a2dfecd781d55655a94b8b9c96daa; not yet in a tagged release)

mashingan commented 4 years ago

docs https://docs.mongodb.com/manual/core/read-preference-mechanics/

I've currently reading from server selection spec but this doc page also explains same.

(and as you probably already know, srv support in dnsclient.nim requires latest HEAD in ba0f3/dnsclient.nim@bce86b1; not yet in a tagged release)

Yes, I've tried the seedlist lookup using dnsclient and it worked. I'm currently changing the internal Mongo object to accommodate multiple connections and to support readPreferences when selecting which server to read from. Nimble can choose the github page and the latest HEAD by suffixing @#head I installed the latest HEAD of dnsclient using that method.

mashingan commented 4 years ago

@timotheecour please try again with fix-ssl-connections as I've rebased this branch to replicate-db branch.
This also supports for multihost connections and I overloaded some of newMongo signature by its argument i.e:

let atlasuri = "mongodb+srv://user:pw@some-shard-cluster-domain.mongodb.com/test?ssl=true"
var mongo = newMongo(uri = atlasuri, poolconn = 2)

# the rest is same

ReadPreferences other than ReadPreferences.primary is still not supported and will throw error. By default the snipped above already set the ReadPreferences to ReadPreferences.primary.
I'll gradually add other ReadPreferences for later.

timotheecour commented 4 years ago

thanks for the update, here's what I tried:

gives: Error: unhandled exception: value out of range: 8027506285661089891 notin -2147483648 .. 2147483647 [RangeDefect]

(see below for full context)

note

when you throw (eg inside newMongo), using raise newException looses all the context of where it was thrown; if instead you do:

    except:
      raise newException(MongoError, &"Other error when sending query: {getCurrentExceptionMsg()}", e)
=>
    except Exception as e:
      raise e

then it shows the proper context; I'm getting:

/Users/timothee/git_clone/nim/timn/tests/nim/all/t11095.nim(74) t11095
/Users/timothee/git_clone/nim/anonimongo/src/anonimongo/core/types.nim(297) newMongo
/Users/timothee/git_clone/nim/dnsclient.nim/src/dnsclient.nim(41) sendQuery
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/net.nim(1559) sendTo
/Users/timothee/git_clone/nim/Nim_devel/lib/system/fatal.nim(49) sysFatal
[[reraised from:
/Users/timothee/git_clone/nim/timn/tests/nim/all/t11095.nim(74) t11095
/Users/timothee/git_clone/nim/anonimongo/src/anonimongo/core/types.nim(315) newMongo
]]
Error: unhandled exception: value out of range: 8027506285661089891 notin -2147483648 .. 2147483647 [RangeDefect]

note 2

I also tried:

except Exception as e:
  raise newException(MongoError, "extra stuff", e)

to turn exception into a MongoError but that also squashed the stacktrace (i thought this should've worked); maybe there's another way to intercept an exception E1, turn it into E2 with some additional msg, yet keep the original stacktrace, not sure yet.

EDIT

in dnsclient.nim, right before

  c.socket.sendTo(c.server, c.port, addr data, bufLen)

it looks like bufLen = 8027506285661089891

EDIT

/cc @ba0f3 this code looks suspicious in dnsclient.nim:

discard buf.readData(addr data, bufLen)

given that data is a string, and addr(data) doesn't point to 1st element of string but instead to address of that string (implementation defined, and could maybe depend on which gc is used, not sure)

timotheecour commented 4 years ago

yay! I'm fixing the above mentioned issue in https://github.com/ba0f3/dnsclient.nim/pull/2, after that, mongodb+srv seems to work! still need to run more checks though, let's keep this issue open until it all works

what i don't understand is how you got it to work in your setup without https://github.com/ba0f3/dnsclient.nim/pull/2 ? which version of nim were you using?

mashingan commented 4 years ago

when you throw (eg inside newMongo), using raise newException looses all the context of where it was thrown;

Ah, yeah, I thought the stacktraces were appended to the new exception type.

what i don't understand is how you got it to work in your setup without ba0f3/dnsclient.nim#2 ? which version of nim were you using?

I checked in Debian and Windows and both never threw that error.
In my specific case, I did a bit different by adding the dnsserver argument to "192.168.18.1" (ISP intranet dns server) because all other dns servers are failed. Hence I need to add the dnsserver argument in newMongo signature to solve that.

nim -v

Nim Compiler Version 1.2.0 [Windows: amd64]
Compiled at 2020-04-03
Copyright (c) 2006-2020 by Andreas Rumpf

git hash: 7e83adff84be5d0c401a213eccb61e321a3fb1ff
active boot switches: -d:release

and

Nim Compiler Version 1.2.0 [Linux: amd64]
Compiled at 2020-04-03
Copyright (c) 2006-2020 by Andreas Rumpf

git hash: 7e83adff84be5d0c401a213eccb61e321a3fb1ff
active boot switches: -d:release

still need to run more checks though, let's keep this issue open until it all works

Yeah, I need to add more guards to the passed multihost url too.

timotheecour commented 4 years ago

does it also work with ba0f3/dnsclient.nim#2 ?

mashingan commented 4 years ago

Tried the patch and it worked no problem.

timotheecour commented 4 years ago

Ah, yeah, I thought the stacktraces were appended to the new exception type.

I've reported it here: https://github.com/nim-lang/Nim/issues/15182

Yeah, I need to add more guards to the passed multihost url too.

somehow mongo+srv seems to work for certain mongo versions of mongo atlas clusters only: works: Version 4.2.8, 4.4.0 doesn't work: 4.0.19, Version 3.6.19

is that your experience as well?

I overloaded some of newMongo signature by its argument

not a huge deal but this causes ambiguities in following case: var mongo = newMongo(uriserver, poolconn = 16)

Error: ambiguous call; both types.newMongo(uri: string, poolconn: int, dnsserver: string) [declared in /Users/timothee/git_clone/nim/anonimongo/src/anonimongo/core/types.nim(273, 6)] and types.newMongo(host: string, port: int, master: bool, poolconn: int, sslinfo: SslInfo) [declared in /Users/timothee/git_clone/nim/anonimongo/src/anonimongo/core/types.nim(253, 6)] match for: (string, int literal(16))

(workaround: use uri = uriserver)

mashingan commented 4 years ago

somehow mongo+srv seems to work for certain mongo versions of mongo atlas clusters only: works: Version 4.2.8, 4.4.0 doesn't work: 4.0.19, Version 3.6.19

Could you copy and paste (or see) that particular version cluster logs? Unfortunately the free tier is locked to version 4.2.
Or maybe just the error message from the snippet code.

I added support for all readPreferences except nearest and in my experience, when using other than readPreferences.primary it will need the connection to be set slaveOk.

not a huge deal but this causes ambiguities in following case: var mongo = newMongo(uriserver, poolconn = 16)

Yes, I overloaded by string argument because I didn't want to change the existing single/direct connection which has string argument host = localhost. I did differ between URI connection and direct string connection, but for multihost, couldn't rely on uri.URI type anymore.
Thought of changing the function name like newMongoUri, or maybe adding a wrapper type like

type MultiHost* = object
    uri*: string

proc newMongo*(mhost: MultiHost ....): Mongo =
   # snip for the rest

Haven't decided which to pick, but for now it's overloaded argument, perhaps later if it's hard to differentiate and prone to be mistaken, I'll change to the wrapper type.


I just realized, I'll just go with distinct string.

mashingan commented 4 years ago

As fix-ssl-connections is merged into develop branch, please pull and checkout to develop HEAD. After some few adjustment, especially the readme, I hope I'll be able to merge into master around next week.

timotheecour commented 4 years ago

please pull and checkout to develop HEAD

done, still works, feel free to remove fix-ssl-connections

timotheecour commented 4 years ago

Haven't decided which to pick, but for now it's overloaded argument, perhaps later if it's hard to differentiate and prone to be mistaken, I'll change to the wrapper type. I just realized, I'll just go with distinct string.

what about a bool argument?

proc newMongo*(url: string, isMultiHost = false): Mongo = ...
timotheecour commented 4 years ago

Could you copy and paste (or see) that particular version cluster logs?

nothing interesting in logs obtained via: download logs, looks like MongoDB Automation Agent is pinging the server every 20s-ish

all i'm seeing is nim stacktrace

/Users/timothee/git_clone/nim/timn/tests/nim/all/t11095.nim(258) t11095
/Users/timothee/git_clone/nim/timn/tests/nim/all/t11095.nim(249) main
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/asyncdispatch.nim(1932) waitFor
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/asyncdispatch.nim(1624) poll
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/asyncdispatch.nim(1365) runOnce
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/asyncdispatch.nim(208) processPendingCallbacks
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/asyncmacro.nim(23) sendOpsNimAsyncContinue
/Users/timothee/git_clone/nim/anonimongo/src/anonimongo/core/utils.nim(42) sendOpsIter
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/asyncmacro.nim(248) getReply
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/asyncmacro.nim(26) getReplyNimAsyncContinue
/Users/timothee/git_clone/nim/anonimongo/src/anonimongo/core/wire.nim(172) getReplyIter
/Users/timothee/git_clone/nim/anonimongo/src/anonimongo/core/bson.nim(103) msgHeaderFetch
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/streams.nim(620) readInt32
/Users/timothee/git_clone/nim/Nim_devel/lib/pure/streams.nim(411) read
mashingan commented 4 years ago

what about a bool argument?

I prefer the distinct type as the before user would need to do parseUri(theUrl) but with distinct string, MultiUri, would just only need to change to MultiUri(theUrl) .

Now I think again, I'll change the name to MongoUri as MultiUri supports also single uri (the standard) and the Mongodb specific multi host-domain.


nothing interesting in logs obtained via

Looking from the stacktrace, it has something to do with socket and/or the connection. My guest would be in some version the cluster dropping the connection and Anonimongo doesn't check whether the connection has closed and doesn't try to reconnect if it's so.
But here's the catch, the stack error should be stopping only in sendops -> socket.send only, not in getReply in case I mentioned above, which means socket had successfully sent something and was expecting reply from the server.

As this would expand the scope from the issue label, I'll close this in the next tagged commit version in master branch, but feel free to open new issue for tackling this Mongodb server behavior.