h0x91b / redis-fast-driver

77 stars 13 forks source link

Errors not being detected properly by `hiredis` on Linux #45

Open victorigualada opened 3 years ago

victorigualada commented 3 years ago

This issue is related to #22 and aiming to provide more information about the real problem.

As the author states in his PR, there is a reconnect issue that appears, but just in Linux (works in Mac, not tested on Windows).

Scenario

Create test app:

$ mkdir redistest && cd redistest
$ npm i redis-fast-driver

test.js:

const RedisClient = require('redis-fast-driver');

const cli = new RedisClient({
        host: '127.0.0.1',
        port: 6379,
        maxRetries: 25,
        autoConnect: true,
        doNotRunQuitOnEnd: true,
});

cli.on('ready', () => console.log('ready'));
cli.on('reconnecting', retry => console.log('reconnecting...', retry));
cli.on('error', e => console.log(e));

run app

$ node test.js

This should trigger the reconnect behaviour and try to reconnect to redis 25 times. But it doesn't. Why is it though?

Debugging

Setting the ENABLELOG property to 1 in the binding-gyp file: https://github.com/h0x91b/redis-fast-driver/blob/2429fa3419e4fbe362d9fdfc91c82eeae9cf0e1d/binding.gyp#L15 and adding some logging showed that the error (status=1 Connection Refused) that should be captured in the ConnectCallback is not being captured https://github.com/h0x91b/redis-fast-driver/blob/2429fa3419e4fbe362d9fdfc91c82eeae9cf0e1d/src/redis-fast-driver.cc#L61-L79

How can we prove that? Testing hiredis:

// Compilation
$ git clone https://github.com/redis/hiredis.git
$ cd hiredis
$ make
$ sudo make install
$ make test

// Test
$ cc examples/example-libuv.c -o hiredis-test -lhiredis -luv
$ ./hiredis-test

Add some extra logging if needed.

This execution proved that the same error (Connection Refused on loopback interface) IS being captured by the connectCallback. So it shouldn't be a problem of hiredis either.

Another important thing is that other errors are being correctly captured and thrown, and the reconnect behavior works as expected, for example, using an invalid host such as:

const cli = new RedisClient({
        host: 'invalid-host',
        port: 6379,
        maxRetries: 25,
        autoConnect: true,
        doNotRunQuitOnEnd: true,
});

This will throw a status=2 Temporarily failure in name resolution. Although not in the connect callback but immediately after the redisAsyncConnect call: https://github.com/h0x91b/redis-fast-driver/blob/2429fa3419e4fbe362d9fdfc91c82eeae9cf0e1d/src/redis-fast-driver.cc#L134-L149

Observations

We are out of ideas on which could be the root problem of this, and why is OS dependant. The last thing that seems to be the problem is the compilation made by node-gyp of hiredis. I'm not familiar with node-gyp so I couldn't go further into it but I saw this hiredis.gyp file with some conditions for MacOS and Solaris: https://github.com/h0x91b/redis-fast-driver/blob/2429fa3419e4fbe362d9fdfc91c82eeae9cf0e1d/deps/hiredis.gyp#L16-L24

Hope you can help with this problem! Thanks!

h0x91b commented 2 years ago

Tried to reproduce it in docker, but all fine...

docker run -it --rm -v $(pwd):/mac node:lts-alpine3.13 /bin/ash

cd /mac
apk add --no-cache --virtual .build-deps alpine-sdk python3
npm i -g node-gyp
node-gyp rebuild

node bug.js

output

/mac # node bug.js
Error: Name does not resolve
    at Redis._onConnect (/mac/index.js:119:26)
    at Redis.connect (/mac/index.js:54:18)
    at processImmediate (internal/timers.js:464:21)
reconnecting... 1
Error: Name does not resolve
    at Redis._onConnect (/mac/index.js:119:26)
    at Redis.connect (/mac/index.js:54:18)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
reconnecting... 2
Error: Name does not resolve
    at Redis._onConnect (/mac/index.js:119:26)
    at Redis.connect (/mac/index.js:54:18)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
reconnecting... 3
Error: Name does not resolve
    at Redis._onConnect (/mac/index.js:119:26)
    at Redis.connect (/mac/index.js:54:18)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Disconnected, exhausted retries.
    at Redis.reconnect (/mac/index.js:74:26)
    at Redis._onConnect (/mac/index.js:120:12)
    at Redis.connect (/mac/index.js:54:18)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Connection Timeout.
    at onError (/mac/index.js:49:26)
    at Timeout._onTimeout (/mac/index.js:57:9)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Disconnected, exhausted retries.
    at Redis.reconnect (/mac/index.js:74:26)
    at onError (/mac/index.js:50:12)
    at Timeout._onTimeout (/mac/index.js:57:9)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Connection Timeout.
    at onError (/mac/index.js:49:26)
    at Timeout._onTimeout (/mac/index.js:57:9)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Disconnected, exhausted retries.
    at Redis.reconnect (/mac/index.js:74:26)
    at onError (/mac/index.js:50:12)
    at Timeout._onTimeout (/mac/index.js:57:9)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Connection Timeout.
    at onError (/mac/index.js:49:26)
    at Timeout._onTimeout (/mac/index.js:57:9)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Disconnected, exhausted retries.
    at Redis.reconnect (/mac/index.js:74:26)
    at onError (/mac/index.js:50:12)
    at Timeout._onTimeout (/mac/index.js:57:9)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Connection Timeout.
    at onError (/mac/index.js:49:26)
    at Timeout._onTimeout (/mac/index.js:57:9)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
Error: Disconnected, exhausted retries.
    at Redis.reconnect (/mac/index.js:74:26)
    at onError (/mac/index.js:50:12)
    at Timeout._onTimeout (/mac/index.js:57:9)
    at listOnTimeout (internal/timers.js:557:17)
    at processTimers (internal/timers.js:500:7)
/mac #

bug.js

const RedisClient = require('./index');

const cli = new RedisClient({
  host: 'invalid-host',
  port: 6379,
  maxRetries: 3,
  autoConnect: true,
  doNotRunQuitOnEnd: true,
});

cli.on('ready', () => console.log('ready'));
cli.on('reconnecting', retry => console.log('reconnecting...', retry));
cli.on('error', e => console.log(e));

Do I miss something or it is related to some special version of Linux or Docker or Node maybe?

victorigualada commented 2 years ago

Right in Docker is working I don't know why I said that in the first place. The problem appears in Ubuntu 20.04 (maybe in other versions and distributions but I don't have the means to test it).

It is definitely something related to the C++ library, but that's how far my understanding of the problem goes. Do you think you can do some investigation on a Linux machine?

h0x91b-wix commented 2 years ago

I'll check if there is a more fresh version of hiredis available, maybe it is fixed already somewhere... BTW check please related issues in hiredis

victorigualada commented 2 years ago

I spent a week going through hiredis and testing it, and it worked properly, throwing an error when it should.. I explained it in my first comment Observations:

Observations hiredis works as expected and captures the Connection Refused error in the connectionCallback redis-fast-driver does not capture the Connection Refused error in the ConnectCallback Errors that shouldn't be captured by the ConnectCallback are properly thrown and the reconnect behavior works properly It works on MacOS but not in Linux or Docker

Just for information, my OS is Ubuntu 20.04.

h0x91b-wix commented 2 years ago

I see, will try to replicate it in VM when will have a time, thanks for details