Open qix opened 8 years ago
minConnections to at least 500, with an appropriately higher maxConnections value. Please be sure that you have tuned your servers according to our recommendations (I assume you are using Linux). Linux or FreeBSD should have no issue handling this many connections.
maxConnections changed to 128 in this commit but I overlooked changing the documentation (see #122)
We've been using the Riak in production for around six months now and have had a ton of issues, primarily around connection handling.
Our system usually requires values stored in Riak in a very bursty fashion, requesting up to around 500 keys within a few milliseconds. We were shocked to realize that each get request required its own connection, and that the default maxConnections was set to 10000. This essentially means that whenever we requested the keys, each key would open its own connection and overwhelm the Riak servers [keep in mind this is happening on 50-100 boxes at similar times].
In an ideal world we would open a connection to each Riak server, send all the commands down them in round-robin fashion, and then wait for responses. I understand the protocol requires a round trip for every request right now, which is its own problem -- I'm not sure if there is anything in the pipeline to solve that.
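To make the round-robin idea concrete, here is a minimal sketch. It assumes nothing about the real client API: `RoundRobinPool` and the `get(key)` method on each client are hypothetical names used for illustration only, and each client is assumed to return a promise.

```javascript
// Hypothetical sketch: `clients` is any array of objects exposing a
// promise-returning `get(key)` method. This is not the Riak client API.
class RoundRobinPool {
  constructor(clients) {
    if (!Array.isArray(clients) || clients.length === 0) {
      throw new Error("at least one client is required");
    }
    this.clients = clients;
    this.next = 0; // index of the client that receives the next request
  }

  // Return the next client in rotation and advance the cursor.
  pick() {
    const client = this.clients[this.next];
    this.next = (this.next + 1) % this.clients.length;
    return client;
  }

  // Dispatch every key immediately, spread evenly across the clients,
  // then wait for all of the responses together.
  getAll(keys) {
    return Promise.all(keys.map((key) => this.pick().get(key)));
  }
}
```

With 500 keys and 5 clients, each client would receive 100 requests, and no timers are involved in dispatching them.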
The queueCommands logic is useless to us, as it would create a whole lot of CPU load by creating 500 timeouts repeatedly. It also takes (N [requests] / M [connections]) * T [queueSubmitInterval] time. With twenty connections our five hundred gets would take 10+ seconds to fetch, and that's ignoring the speed/latency of the actual Riak servers. Yes, we could drop queueSubmitInterval, but dropping it low then causes a ton of CPU burn creating useless timers.
I know this was a bunch of complaints and not much in the line of solutions... we're actually looking at switching datastores for our simpler "write once" key-value requests, which will alleviate most of the load. As a stop-gap we've implemented a super simple RiakCluster on our end which creates a bunch of RiakClients and load-balances them properly.
Some suggestions that would help a ton:
100
?
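For reference, the queue-drain estimate given earlier, (N / M) * T, can be worked through as a small calculation. The function name and the 500 ms interval are assumptions made for illustration; the interval value is not stated in this thread.

```javascript
// Rough estimate of how long a queued burst takes to drain, following the
// (N / M) * T formula above. Rounds up because a partial batch still
// costs a full interval.
function estimateQueueDrainMs(requests, connections, submitIntervalMs) {
  if (connections <= 0) {
    throw new Error("connections must be positive");
  }
  return Math.ceil(requests / connections) * submitIntervalMs;
}

// 500 gets over 20 connections, assuming a 500 ms queueSubmitInterval
// (an assumed value, not taken from this thread).
console.log(estimateQueueDrainMs(500, 20, 500)); // 12500
```

Under that assumed interval the burst needs 25 submit cycles, which is in line with the "10+ seconds" figure, before any server latency is counted.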