Closed jbt closed 8 years ago
@jbt what commands and what options do you use? And please activate the debug mode to see what happens inside node_redis.
I've been trying to narrow in on a base case to figure out whether the bug is here in redis-clustr, in node_redis, or in redis-parser. It's proving to be a little tricky to reduce down but as soon as I've got a better idea of exactly what's causing the problems then I'll open issues in the relevant places
Slowly beginning to track this down. It looks like the error is coming most often from a cluster slots
command. More information as and when I can find it out.
Putting here mostly for my own reference - I'll pick this up tomorrow and open some proper issues:
It seems not to be an issue with redis-clustr. The following test case seems to throw the same error with redis@2.6.1
but not redis@2.5.3
, although only very sporadically:
var redis = require('redis');
[
// big long list of { host: blah, port: blah }
].forEach(function(c){
var cli = redis.createClient(c.port, c.host);
cli.on('error', function(err) {
console.error(err);
console.error(err.stack);
});
cli.on('ready', function() {
cli.cluster('slots', function(){});
});
})
With only one host in the list it seems to be ~10% of the time, and it'll error with a [ReplyError: Protocol error, got "3" as reply type byte. Please report this.]
(where the "3"
can be "5"
or "\r"
or many other things).
Occasionally I'll also see this:
/home/jt/Development/testfoo/node_modules/redis/index.js:710
if (typeof command_obj.callback === 'function') {
^
TypeError: Cannot read property 'callback' of undefined
at normal_reply (/home/jt/Development/testfoo/node_modules/redis/index.js:710:27)
at RedisClient.return_reply (/home/jt/Development/testfoo/node_modules/redis/index.js:816:9)
at JavascriptRedisParser.Parser.returnReply (/home/jt/Development/testfoo/node_modules/redis/index.js:188:18)
at JavascriptRedisParser.execute (/home/jt/Development/testfoo/node_modules/redis/node_modules/redis-parser/lib/parser.js:392:12)
at Socket.<anonymous> (/home/jt/Development/testfoo/node_modules/redis/index.js:267:27)
at emitOne (events.js:77:13)
at Socket.emit (events.js:169:7)
at readableAddChunk (_stream_readable.js:153:18)
at Socket.Readable.push (_stream_readable.js:111:10)
at TCP.onread (net.js:531:20)
I'll check up on what redis versions are doing this (pretty sure there're some servers in my test list running old versions) to make sure it's not a silly redis bug, and then open some issues tomorrow when I haven't been staring at it for hours.
The error is very likely not an error on your side. The parser error can pretty much just be returned if there's a mistake in the parser code (or Redis returned faulty data and this is highly unlikable). That's why the error explicitly asks for being reported in the error message ;-).
But currently I don't really know where to search for the error. Please outcomment line 330 of the parser, increase the stack trace limit to a high value (e.g. 20) in the ReplyError line 7 and activate the debug mode (NODE_DEBUG=redis
). That output will hopefully help a lot finding the bug.
Ping @Salakar
Ok I tried commenting out the decreaseBufferPool interval and I'm still seeing both these weird errors.
I said yesterday that I could reproduce it with a single redis instance but I think I was wrong - it looks like it only happens with more than one client open. So it looks to me like it could be some sort of shared buffer being used by multiple clients.
Debugging output from two such cases here (for the Cannot read property 'callback' of undefined
error) and here (for the reply type byte error).
In both cases all the redis servers in question were running redis 3.0.5, with the test script from my earlier comment. The error doesn't happen reliably but I sometimes have to run the script a few times before seeing the error.
Would you like me to open an issue / multiple issues on redis-parser
for this? Happy to provide any other information that'll help track this down.
I also tried uncommenting the buffer_from_socket.toString()
log in node_redis line 266 and that produced this output which might help. In that case I'm using two hosts (10.0.105.126:6380 and 10.0.105.254:6380)
I found the issue. I forgot that Buffer.slice()
does not copy the buffer but instead returns a reference. Now if there are multiple parsers that receive data concurrently and the parser still holds some data from a former chunk that was sliced out of the bufferPool and the other parser uses the bufferPool too, the first parser holds faulty data in it's buffer from now on.
Glad I could help with tracking this down :) Would you like me to open an issue on node-redis-parser or shall I leave it to you to take it from here?
redis-parser@2.0.1
seems to resolve these issues, thanks @BridgeAR!
I'll do some more checking that everything's good between clustr and redis@2.6
and get an update on npm
Oops, I forgot to close this when I published v1.5.0. All is good now :+1:
Since node-redis update to 2.6 brought redis-parser 2.0.0, we've been seeing a lot of errors like this:
where the "8" there varies but seems to be a whole load of different values.
For the time being I'm going to lock down the
redis
dependency here to~2.5.0
until we have time to investigate further and figure out what exactly is going on with the newer redis parser.