mogadanez opened this issue 7 years ago
> I have a timeout of 20 sec

How did you set this timeout?

> these queries stall for this timeout and do not try to recover according to the [executionAttempts=3] setting

I don't understand this statement. Are you saying that the query did retry in your scenario, but you would expect it not to retry?
```
Error: Connection closed while executing command
```

This is caused by Riak closing the connection, or by a proxy between your application and Riak closing the connection.
> How did you set this timeout?

I was wrong; that timeout is used only for the Solr request, which we currently make with plain HTTP. The node is constructed and added to the cluster like this:
```javascript
function makeNode(host) {
  return new Riak.Node.Builder()
    .withRemoteAddress(host)
    .withRemotePort(riakCfg.pbPort)
    .build();
}

var client = new Riak.Client(new Riak.Cluster({
  nodes: currentNodesList.map(x => makeNode(x))
}));
```
So I assume the default timeouts are used.
> I don't understand this statement. Are you saying that the query did re-try in your scenario, but you would expect it not to re-try?
No, it does _not_ retry, but I expect that it would.
I'm listening for the node event and logging retries:

```javascript
node.on("retryCommand", (command, node) => {
  makeLogger().warn({
    command: _.pick(command, "PbRequestName", "remainingTries", "name"),
    bucket: command.options.bucket,
    indexName: command.options.indexName,
    bucketType: command.options.bucketType,
    key: String(command.options.key),
    riakNode: formatRiakNode(node)
  }, "Command failed " +
     (command.options.indexName ? "(" + command.options.indexName + ")" : "") +
     ", Retry (" + command.remainingTries + " remaining)");
});
```

but for this command nothing is logged.
> Are you using a proxy, like HAProxy? How often do you see this error?

We are not using a proxy. It occurs more often at peak time; in total about 300–400 per day.
@mogadanez thanks for all of your information. I don't know why the command is not retrying in your situation; I will keep looking into that. If you could log `command.remainingTries` in your error handler, that would be helpful.
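A minimal sketch of what such logging could look like. The `command` object here is a mock stand-in shaped like the client's command objects (`name`, `remainingTries` per the event handler above); the helper function and its name are hypothetical, not part of the client's API:

```javascript
// Hypothetical helper: build a log line that includes the retry budget
// (command.remainingTries) so retry behavior can be diagnosed.
function formatCommandError(command, err) {
  return "Command " + command.name + " failed: " + err +
         " (remainingTries=" + command.remainingTries + ")";
}

// Example invocation with a mock command object:
var msg = formatCommandError(
  { name: "SecondaryIndexQuery", remainingTries: 2 },
  "Connection closed while executing command"
);
console.log(msg);
```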
What is happening is that, at peak time, the additional cluster load causes secondary index queries to exceed the client-side request timeout of 5 seconds (timeout is set here).
You can see this in your stack trace (I didn't notice it the first time I looked):
```
at RiakConnection._connHandleTimeout (/var/proofme/node_modules/basho-riak-client/lib/core/riakconnection.js:125:14)
at Socket.g (events.js:291:16)
at emitNone (events.js:86:13)
at Socket.emit (events.js:185:7)
at Socket._onTimeout (net.js:339:8)
```
This means that the command took longer than the default of 5 seconds to execute. The internal timeout in Riak for a secondary index query is 60 seconds, so you could use the `requestTimeout` option to set the request timeout to at least 60 seconds.
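A configuration sketch of this first option, raising the client-side `requestTimeout` on each node to cover Riak's internal 60-second secondary index timeout. The builder method name `withRequestTimeout` is an assumption based on the `requestTimeout` option named above; `riakCfg` is the poster's own config object:

```javascript
var Riak = require('basho-riak-client');

// Sketch: build each node with a request timeout at least as long as
// Riak's internal 60 s secondary index query timeout.
function makeNode(host) {
  return new Riak.Node.Builder()
    .withRemoteAddress(host)
    .withRemotePort(riakCfg.pbPort)
    .withRequestTimeout(60000) // assumed setter for the requestTimeout option
    .build();
}
```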
Note that this change affects every request to Riak. Another option would be to increase `requestTimeout` to a larger value, like 30 seconds, on every node, and set the `timeout` option on the `SecondaryIndexQuery` command to `29750` (docs). This means that Riak will spend a maximum of 29750 milliseconds executing the secondary index query before timing out and sending a timeout response back to your application. Having this value slightly less than the `requestTimeout` makes it more likely the command won't error due to a client-side timeout.

You will have to experiment to find the best `requestTimeout` value that allows secondary index queries to run to completion during peak time.
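A sketch of the second option under stated assumptions: a 30-second client-side `requestTimeout` per node plus a 29750 ms Riak-side `timeout` on the query itself. The bucket, index name, host, and the builder methods `withRequestTimeout` and `withTimeout` are assumptions based on the options discussed above and the linked docs, not verified API:

```javascript
var Riak = require('basho-riak-client');

// Client-side cap: 30 s per request on this node.
var node = new Riak.Node.Builder()
  .withRemoteAddress('riak-host')   // hypothetical host
  .withRemotePort(8087)
  .withRequestTimeout(30000)        // assumed setter for requestTimeout
  .build();

var client = new Riak.Client(new Riak.Cluster({ nodes: [node] }));

// Riak-side cap: 29750 ms, slightly below the 30 s client-side timeout,
// so Riak's own timeout response arrives before the client gives up.
var query = new Riak.Commands.KV.SecondaryIndexQuery.Builder()
  .withBucket('my-bucket')          // hypothetical bucket
  .withIndexName('my_index_int')    // hypothetical integer 2i index
  .withIndexKey(42)
  .withTimeout(29750)               // assumed setter for the timeout option
  .withCallback(function (err, resp) {
    if (err) { console.error(err); }
  })
  .build();

client.execute(query);
```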
- [x] Riak Node.js Client version: 2.4
- [x] Node.js version: 6.5.0
- [x] Riak version: 2.14
- [x] Operating System / Distribution & Version: Debian Jessie (stable)
- [x] Riak `error.log` file, if applicable
- [x] What commands were being executed during the error: got `Error: Connection closed while executing command`
- [x] What you expected to have happen: I have a timeout of 20 sec. These queries stall for this timeout and do not try to recover according to the [executionAttempts=3] setting.