Open j3k0 opened 7 years ago
I'm wondering whether solution 1 would work. Wouldn't it connect to the same instance given the same env vars?
Starting a new instance with the same env vars works. It just needs to close the TCP connection with the old master.
Are you sure? The Redis docs mention a command for obtaining the address of the new master, and broadcasting the configuration in case a failover occurs. This package's readme also mentions something like that. Or is that part handled by the docker image, and we simply reconnect to the same host:port? Just want to make sure :)
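For reference, the command the Redis docs describe is `SENTINEL get-master-addr-by-name`; a minimal sketch of querying a sentinel from node (the sentinel host and the master name `mymaster` are made up):

```js
const redis = require('redis');

// Ask a sentinel which host:port is currently the master:
const sentinel = redis.createClient(26379, 'sentinel-host');
sentinel.send_command('SENTINEL', ['get-master-addr-by-name', 'mymaster'], function (err, addr) {
  if (err) throw err;
  console.log(addr); // e.g. [ '10.0.0.2', '6379' ] after a failover
});
```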
Yes, it should just reconnect to the same host:port; it's supposedly transparent to the client. In prod we connect clients to a so-called "ambassador", an haproxy instance that forwards redis TCP connections to the master host (in priority) and falls back to a slave if the master is not responding, so read operations are still handled.
When I think about it, the problem is probably that haproxy doesn't close already-open TCP connections when the master changes. This might be interesting to test and fix (if possible); that way clients wouldn't need any change in this case.
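If stale connections are indeed the problem, haproxy has a server option for exactly this; a hypothetical sketch of the ambassador config (addresses and names are made up, and note the `role:master` check shown here would replace the read fallback to slaves described above):

```
listen redis
  bind *:6379
  option tcp-check
  # Only treat a server as healthy if it reports itself as master:
  tcp-check send PING\r\n
  tcp-check expect string +PONG
  tcp-check send info\ replication\r\n
  tcp-check expect string role:master
  # Kill established client connections when a server gets marked down,
  # so pinned clients reconnect and land on the new master:
  server redis-a 10.0.0.1:6379 check inter 1s on-marked-down shutdown-sessions
  server redis-b 10.0.0.2:6379 check inter 1s on-marked-down shutdown-sessions
```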
> and falls back to a slave if the master is not responding
Yeah, but if I understand correctly, we want to reconnect to a new master selected by the sentinel failover, so that writes succeed as well. But if we, say, listen for a new master with sentinel pub/sub or query `INFO replication` (or switch to ioredis, which has the same API for issuing commands but handles sentinels automatically), we might receive a different host and/or port, which I'm not sure is reachable from our docker container; I assume we only see haproxy from there.
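For comparison, this is roughly what the ioredis setup would look like (a sketch; sentinel addresses and the master group name are assumptions):

```js
const Redis = require('ioredis');

// ioredis asks the sentinels for the current master and re-resolves it
// on failover, so commands keep going to whichever host is master:
const client = new Redis({
  sentinels: [
    { host: 'sentinel-1', port: 26379 },
    { host: 'sentinel-2', port: 26379 },
  ],
  name: 'mymaster', // master group name as configured in sentinel
});
```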
Anyways, what I started doing is wrapping all redis commands with our own class and monitoring errors. If something like in your original post occurs, we'll discard the current client and create a new one. And later, we can add a check to the ping API as well.
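Roughly the shape I have in mind, as a sketch (the names here are made up, not the actual code):

```js
const redis = require('redis');

function createFailoverAwareClient(options, shouldReconnect) {
  // By default, treat READONLY errors (writes sent to a slave) as the
  // signal that a failover happened behind our back:
  shouldReconnect = shouldReconnect || function (err) { return /READONLY/.test(err.message); };
  let client = redis.createClient(options);

  return function exec(command, args, callback) {
    client.send_command(command, args, function (err, reply) {
      if (err && shouldReconnect(err)) {
        client.quit();                        // discard the stale connection
        client = redis.createClient(options); // fresh TCP connection, hopefully to the new master
      }
      callback(err, reply);
    });
  };
}
```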
Okay, here's a first pass; it should work almost as a drop-in replacement for redis (at least for what we use, and the config object not being a root-level constructor option). It wraps the redis client and, for each command executed, checks whether we should reconnect, using a passed-in function (one that tests for `READONLY` by default). Differences:

- `multi()` does not support nested transactions (so no `redis.multi().multi()` allowed);
- `multi()` only works via chaining, as sketched below (no args accepted; node-redis docs mention some optional thingy, not sure what that is);
- `batch()` is not supported.

Please check it out and let me know what you think. I'm not quite comfortable judging it only by myself (unless you solved this one some other way already). Overall, this sounds okay to me, but if it's possible to switch to ioredis given the Docker situation, I'd rather we do that :)
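For the record, the chained form looks like this (a sketch, with `client` being a hypothetical wrapper instance):

```js
client.multi()
  .set('some:key', 'value')
  .incr('some:counter')
  .exec(function (err, replies) {
    // errors here go through the same READONLY check
  });
```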
I'll check it out in a few days (I'm on the move these days).
Please, for the next task, make sure you check waffle: there are other, more urgent things to take care of. This one is an issue that occurred just once in the last couple of years; nothing I'd classify as high priority.
Scenario: a `ganomede-invitation` instance didn't close its connection with the original redis master server, so it's now connected to a slave.

Possible solutions:

1. `INFO REPLICATION` on all linked redis databases from `/ping/`, to see if we're connected to master instances. Fail if not.

Maybe 1 is good enough and easy to put in place, since db requests are pretty centralized?
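A sketch of what solution 1 could look like (the endpoint wiring is omitted and the helper name is made up):

```js
// From the /ping/ handler, ask each linked redis client for its role:
function assertMaster(client, callback) {
  client.info('replication', function (err, info) {
    if (err) return callback(err);
    if (!/role:master/.test(info)) {
      return callback(new Error('connected to a slave instead of the master'));
    }
    callback(null);
  });
}
```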