microsoftarchive / redis

Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes
http://redis.io

PubSub reliability problem #475

Open alexs20 opened 8 years ago

alexs20 commented 8 years ago

Hi, I don't know whether this issue has already been reported (I did not find it here), but I want to share my concerns. I am trying to use a cluster of 3 Redis servers on Windows, and I found that under heavy load the cluster loses (or drops) pub/sub messages. If the subscribers and publishers are connected to the same node, there are no problems and all messages are delivered properly, but if they use different nodes, 10-20% of the messages are lost. I tried to reproduce the same scenario on a Redis cluster on Linux, and all messages reach their destinations, no matter how heavily the nodes are loaded or which nodes are used for publishing and subscribing. Any thoughts?

Thanks.

c4wrd commented 8 years ago

This is a documented issue with Redis: under high load the master server may not be able to successfully propagate the message to slaves.

Here's an excerpt from the official Redis cluster spec:

A write may reach a master, but while the master may be able to reply to the client, the write may not be propagated to slaves via the asynchronous replication used between master and slave nodes. If the master dies without the write reaching the slaves, the write is lost forever if the master is unreachable for a long enough period that one of its slaves is promoted.

I'm still curious and would like to try this out. Could you include some more details about the environment (messages/sec, cluster information, etc.)?

alexs20 commented 8 years ago

It is very easy to reproduce this issue. In my case I have a cluster of 3 masters (without slaves) running on the same machine (Win 7 enterprise) on ports 7001, 7002 and 7003.

The publisher and subscriber are small Java applications using the Jedis library. I started one subscriber client connected to 7001 and 10 publishers, each randomly connected to one of the master nodes. Each publisher sends 100000 messages (the same message) to a single channel in a loop. The subscriber receives and counts them. At the end the subscriber should show 100000 x 10 messages, but it always shows fewer. When I run the same scenario on a Linux box, it always shows the expected number of messages. Please see the pub/sub code below.

    // === SUBSCRIBER === //
    package tmp;

    import java.util.concurrent.atomic.AtomicInteger;

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPubSub;

    public class PubSubReceive {
        // Running count of messages received on the channel.
        static AtomicInteger ai = new AtomicInteger();

        public static void main(String[] args) {
            JedisPubSub pubsubListener = new JedisPubSub() {
                @Override
                public void onMessage(String channel, String message) {
                    // Count the message and print the running total.
                    System.out.println(ai.incrementAndGet());
                }
            };
            Jedis jedis = new Jedis("localhost", 7001);
            // subscribe() blocks while the listener is subscribed.
            jedis.subscribe(pubsubListener, "thechannel");
            jedis.close();
        }
    }

    // === PUBLISHER === //
    package tmp;

    import java.util.Random;

    import redis.clients.jedis.Jedis;

    public class PubSubSend {

        public static void main(String[] args) {
            // nextInt(3) yields 0, 1 or 2, so the port is 7001, 7002 or 7003.
            Jedis jedis = new Jedis("localhost", 7001 + new Random().nextInt(3));
            // Send the same message 100000 times to the single channel.
            for (int i = 0; i < 100000; i++) {
                jedis.publish("thechannel", "msg");
            }
            jedis.close();
        }
    }
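The subscriber above only reports how many messages arrived in total. To see which publisher's messages go missing, one option is to tag each payload with a publisher id and a sequence number and track gaps on the receiving side. The sketch below is only an illustration of that idea: the `SequenceGapTracker` class and the `<publisherId>:<seq>` payload format are hypothetical and not part of Jedis or Redis, and it assumes sequence numbers start at 0 and that no message is delivered twice.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for diagnosing lost pub/sub messages; not part of Jedis.
class SequenceGapTracker {
    // Per publisher: how many messages arrived and the highest sequence number seen.
    private final Map<String, Long> received = new HashMap<>();
    private final Map<String, Long> highestSeq = new HashMap<>();

    // Records one payload of the assumed form "<publisherId>:<seq>".
    void record(String payload) {
        int sep = payload.lastIndexOf(':');
        String publisher = payload.substring(0, sep);
        long seq = Long.parseLong(payload.substring(sep + 1));
        received.merge(publisher, 1L, Long::sum);
        highestSeq.merge(publisher, seq, Long::max);
    }

    // Number of messages missing below the highest sequence number seen so far.
    // Messages lost after the last one received are not detected by this method.
    long missing(String publisher) {
        Long highest = highestSeq.get(publisher);
        if (highest == null) {
            return 0; // nothing received from this publisher yet
        }
        return (highest + 1) - received.get(publisher);
    }

    // Total number of messages recorded across all publishers.
    long totalReceived() {
        long total = 0;
        for (long count : received.values()) {
            total += count;
        }
        return total;
    }
}
```

With this in place, each publisher would send something like `jedis.publish("thechannel", id + ":" + i)` instead of the constant `"msg"`, and `onMessage` would call `record(message)` on a shared tracker. Comparing `missing(...)` per publisher could show whether losses are tied to publishers connected to a node other than the subscriber's.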