Right now, we publish everything we read from the oplog to a Redis channel. This means that if you're running two copies of oplogtoredis, you'll end up with duplicate messages sent to Redis.
Instead, we could do something like:
```
ok := SET <unique id from oplog> true NX EX <expiration>
if ok {
    // message was not already published
    PUBLISH <channel> <msg>
}
```
Which would use Redis to de-duplicate messages based on the unique ID assigned to all oplog operations.
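The dedup logic above can be sketched in Go (the language oplogtoredis is written in). This is a minimal illustration, not the project's actual code: the `dedupStore` type is a hypothetical in-memory stand-in for Redis `SET ... NX EX` (the TTL is omitted), and `publishIfNew` stands in for the publish path. In production the `setNX` and `publish` calls would go to Redis, and the first instance to claim an oplog ID wins:

```go
package main

import (
	"fmt"
	"sync"
)

// dedupStore mimics Redis `SET <key> true NX` with an in-memory map.
// A real deployment would use Redis itself (with an EX expiration) so
// that multiple oplogtoredis processes share the same dedup state.
type dedupStore struct {
	mu   sync.Mutex
	seen map[string]bool
}

// setNX returns true only if the key was not already present,
// i.e. the first writer wins, just like SET ... NX.
func (d *dedupStore) setNX(key string) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.seen[key] {
		return false
	}
	d.seen[key] = true
	return true
}

// publishIfNew publishes msg only if this oplog ID hasn't been claimed yet.
func publishIfNew(store *dedupStore, oplogID, msg string, publish func(string)) {
	if store.setNX("oplog:" + oplogID) {
		publish(msg)
	}
}

// runDemo simulates two oplogtoredis instances reading the same two
// oplog entries; each message should be published exactly once.
func runDemo() []string {
	store := &dedupStore{seen: map[string]bool{}}
	var published []string
	publish := func(msg string) { published = append(published, msg) }

	for instance := 0; instance < 2; instance++ {
		publishIfNew(store, "op-1", "insert users", publish)
		publishIfNew(store, "op-2", "update posts", publish)
	}
	return published
}

func main() {
	fmt.Println(runDemo())
}
```

Even though both simulated instances see both oplog entries, only two messages come out, because the second `setNX` on each ID fails. With real Redis, the `EX <expiration>` keeps the dedup keys from accumulating forever.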
This gives us a couple of nice advantages:

- You can run two copies of oplogtoredis, and if one crashes or becomes unavailable, the other will continue to process messages without any disruption. Combined with a Mongo replica set, this gives us high availability and the ability to, for example, smoothly fail over if an AWS AZ fails.
- If you're running two copies of oplogtoredis and one freezes (because the machine it's on is busy, there's a GC pause, etc.), the other one will pick up the slack.
- You can perform zero-downtime rolling upgrades of oplogtoredis by starting the new version, then shutting down the old version (using something like a Kubernetes deployment or another orchestration tool).