oplogtoredis

Publish MongoDB oplog entries to Redis
Apache License 2.0

This program tails the oplog of a Mongo server, and publishes changes to Redis. It's designed to work with the redis-oplog Meteor package.

Unlike redis-oplog's default behavior, which requires that everything that writes to Mongo also publish matching notifications to Redis, using oplogtoredis in combination with redis-oplog's externalRedisPublisher option guarantees that every change written to Mongo will be automatically published to Redis.

Project Status

The project is currently stable and used in production at Tulip. Before using this project in production, review the Known Limitations below, and test it out in a staging environment.

Known Limitations

There are a few things that don't currently work in redis-oplog when using the externalRedisPublisher option, so those features won't work when using redis-oplog together with oplogtoredis. These features are part of redis-oplog's fine-tuning options. If you don't use any of redis-oplog's fine-tuning options, you won't run into any of these limitations.

Nix

We use Nix at Tulip, so we provide a default.nix, along with a flake.nix wrapper around it for Flake support. The default.nix file contains a vendorHash attribute, which is the calculated hash of the downloaded modules to be vendored. To update this value after a go.sum change (which means that a different set of modules is downloaded), follow these instructions:

  1. Edit default.nix and change the vendorHash attribute to an empty string.
  2. Run nix build and observe the output. You are looking for this section:

    error: hash mismatch in fixed-output derivation '/nix/store/by8bc99ywfw6j1i6zjxcwdmc5b3mmgzv-oplogtoredis-3.0.0-go-modules.drv':
          specified: sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
             got:    sha256-ceToA2DC1bhmg9WIeNSAfoNoU7sk9PrQqgqt5UbpivQ=
  3. Use the sha256-ceToA2DC1bhmg9WIeNSAfoNoU7sk9PrQqgqt5UbpivQ= for the value of vendorHash and verify that nix build succeeds.

MongoDB v5

MongoDB v5 makes a substantial change to the format of the oplog, which makes it much more difficult to assemble the list of changed fields from an oplog entry. Meteor has handled this with a full oplog v2 to v1 converter, which entails substantial complexity and a high level of dependence on the (undocumented) oplog format, and requires testing against every new version of Mongo. To reduce this maintenance burden, oplogtoredis implements a simplified version of this algorithm that just extracts changed top-level fields.

Take, for example the command: db.Foo.update({ _id: 'someId', }, { $set: { "one.two.three": 123, "four": 4 } }).

With MongoDB v4, oplogtoredis would produce the record {"e":"u","d":{"_id":"someId"},"f":["one.two.three", "four"]}. However, with MongoDB v5, oplogtoredis would instead produce the record {"e":"u","d":{"_id":"someId"},"f":["one", "four"]}.
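The simplification described above can be illustrated with a small sketch: reduce each (possibly dotted) changed field path to its top-level field name. This is an illustration of the v5-era behavior, not oplogtoredis's actual implementation, which extracts these paths from the (undocumented) oplog v2 diff format.

```go
package main

import (
	"fmt"
	"strings"
)

// topLevelFields reduces a list of (possibly dotted) changed field
// paths to their unique top-level field names, mirroring the
// simplified MongoDB v5 behavior described above.
func topLevelFields(paths []string) []string {
	seen := map[string]bool{}
	var out []string
	for _, p := range paths {
		top := p
		if i := strings.Index(p, "."); i >= 0 {
			top = p[:i] // keep only the segment before the first dot
		}
		if !seen[top] {
			seen[top] = true
			out = append(out, top)
		}
	}
	return out
}

func main() {
	// The $set from the example above touches these paths:
	fmt.Println(topLevelFields([]string{"one.two.three", "four"}))
	// → [one four], matching the "f" array in the v5-style record
}
```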

This produces correct behavior when used with the redis-oplog Meteor package: we're simply indicating a superset of the actual change. However, it's possible that it produces sub-optimal performance. For example, say you're observing the query Foo.find({ _id: 'someId' }, { fields: { 'one.two.three': 1 }}). With MongoDB v4, the command db.Foo.update({ _id: 'someId', }, { $set: { "one.two.xx": 123 } }) would not be considered by redis-oplog to be a possible change to the observed query results (because it considers the sub-fields, and detects that a change to one.two.xx cannot affect the observed field one.two.three). However, with the new MongoDB v5 behavior, redis-oplog would detect that this could have changed the observed query results, because it just sees a change to the top-level field one.

This is a trade-off between optimal performance and maintainability: implementing a full v2-to-v1 conversion layer like Meteor does would provide slightly better performance in these specific cases, but with a much increased risk of incorrect behavior (e.g. not triggering an update when we should) if Mongo changes the oplog format, which they've indicated they reserve the right to do even in a patch release.

Configuring redis-oplog

To use oplogtoredis with redis-oplog, configure redis-oplog with externalRedisPublisher: true. For example, if your MONGO_URL for Meteor is mongodb://mymongoserver/mydb, you might use this config:

{
    "redisOplog": {
        "redis": {
            "port": 6379,
            "host": "myredisserver"
        },
        "globalRedisPrefix": "mydb.",
        "externalRedisPublisher": true
    }
}

Deploying oplogtoredis

You can build oplogtoredis from source with go build ., which produces a statically-linked binary you can run. Alternatively, you can use the public docker image. We previously hosted these images on Docker Hub, but those images are no longer maintained.

Environment Variables

You must set the following environment variables:

You may also set the following environment variables to configure the level of logging:

There are a number of other environment variables you can set to tune various performance and reliability settings. See the config package docs for more details.

Running oplogtoredis in production

oplogtoredis includes a number of features to support its use in production-critical scenarios.

High Availability

oplogtoredis uses Redis to deduplicate messages, so it's safe (and recommended!) to run multiple instances of oplogtoredis to ensure availability in the event of a partial outage or a bug in oplogtoredis that causes it to crash or hang.

Just run two copies of oplogtoredis with the same configuration. Each copy will process each oplog message and send it to Redis, where we use a small Lua script to deduplicate the messages. It's not recommended to run more than 2 or 3 copies of oplogtoredis, because the load it puts on your Mongo and Redis databases increases linearly with the number of copies of oplogtoredis that you're running.
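The deduplication idea can be sketched in-memory: publish a message only if its unique ID has not been seen before. The real implementation does this atomically inside Redis with a small Lua script so that multiple oplogtoredis instances share one seen-set; this sketch only illustrates the logic, and the ID format shown is made up.

```go
package main

import (
	"fmt"
	"sync"
)

// deduper illustrates the idea behind oplogtoredis's Redis-side
// deduplication: a message is published only if its unique ID has
// not already been recorded by another instance.
type deduper struct {
	mu   sync.Mutex
	seen map[string]bool
}

// publishOnce runs publish and returns true only for the first
// caller to present a given message ID.
func (d *deduper) publishOnce(id string, publish func()) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.seen[id] {
		return false // another instance already published this message
	}
	d.seen[id] = true
	publish()
	return true
}

func main() {
	d := &deduper{seen: map[string]bool{}}
	// Two instances processing the same oplog entry (hypothetical ID):
	fmt.Println(d.publishOnce("ts-12345-0", func() {})) // → true
	fmt.Println(d.publishOnce("ts-12345-0", func() {})) // → false
}
```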

Resumption

oplogtoredis uses Redis to keep track of the last message it processed. When it starts up, it checks to see where it left off, and will resume tailing the oplog from that timestamp, as long as it's not too far in the past (we don't want to replay too much of the oplog, or we could overload the Redis or Meteor servers). The config package docs have more detail on how to tune this behavior, but this feature should keep your system working properly even if every copy of oplogtoredis that you're running goes down for a brief period.

Monitoring

oplogtoredis exposes an HTTP server that can be used to monitor the state of the program. By default, it serves on 0.0.0.0:9000, but you can change this with the environment variable OTR_HTTP_SERVER_ADDR.

The HTTP server exposes a health-checking endpoint at /healthz. This endpoint checks connectivity to Mongo and Redis, and then returns 200. If HTTP requests to this endpoint time out or return non-200 codes for more than a brief period (10-15 seconds), you should consider the program unhealthy and restart it. You can do this using Kubernetes liveness probes, an Icinga health check, or any other mechanism.

The HTTP server also exposes a Prometheus endpoint at /metrics that your Prometheus server can scrape to collect a number of useful metrics. In particular, if you see the value of the metric otr_redispub_processed_messages with the label status=sent fall below the rate of writes to your Mongo database, it likely indicates an issue with oplogtoredis.

Logging

oplogtoredis by default emits info, warning, and error messages as JSON, for easy consumption by structured logging systems. When debugging oplogtoredis, you may want to set OTR_LOG_DEBUG=true, which will log more detailed messages and log in a human-readable format for manual review.

Development

You can use go build to build and test oplogtoredis, or you can use the docker-compose environment (docker-compose up), which spins up a full environment with Mongo, Redis, and oplogtoredis.

The components of the docker-compose environment are:

You can optionally also spin up 2 meteor app servers with docker-compose -f docker-compose.yml -f docker-compose.meteor.yml up. These servers are running a simple todos app, using redis-oplog, and pointing at the same Mongo and Redis servers. Note that the first run will take a long time (5-10 minutes) while Meteor does initial downloads/builds; caching makes subsequent startups much, much faster.

The additional docker-compose.meteor.yml file contains:

Testing

There are a number of testing tools and suites that are used to check the correctness of oplogtoredis. All of these are run by Travis on each commit.

You can run all of the tests locally with scripts/runAllTests.sh.

Linting and static analysis

We use gometalinter to detect stylistic and correctness issues. Run scripts/runLint.sh to run the suite.

You'll need gometalinter and its dependencies installed. You can install them with go get github.com/alecthomas/gometalinter; gometalinter --install.

Unit tests

We use the standard go test tool. We wrap it as scripts/runUnitTests.sh to set timeout and enable the race detector.

Integration tests part 1: acceptance tests

These acceptance tests test a production-ready docker build of oplogtoredis. They run the production docker container alongside Mongo and Redis docker containers (using docker-compose), and then test the behavior of that whole system.

They are run both with and without the race detector, and against a number of different versions of Mongo and Redis.

Run these tests with scripts/runIntegrationAcceptance.sh. This suite takes a while to run, because it's run against many different combinations of Redis versions, Mongo versions, and race-detector settings. Use scripts/runIntegrationAcceptanceSingle.sh for a quick run with only a single configuration.

Integration tests part 2: fault-injection tests

These tests observe the behavior of oplogtoredis under a number of fault conditions, such as a crash and restart of oplogtoredis, and temporary unavailability of Mongo and Redis. To provide the test harness more control over the environment, they operate on a compiled binary of oplogtoredis rather than a docker image. They run inside a single docker container, with oplogtoredis, Mongo, and Redis spun up and down by the test harness itself.

Run these tests with scripts/runIntegrationFaultInjection.sh.

Integration tests part 3: meteor tests

These tests use docker-compose to spin up oplogtoredis, Mongo, Redis, and two Meteor application servers (running the app from ./testapp), and then connect to the Meteor servers via websockets/DDP and observe the behavior of traffic on the wire to ensure that oplogtoredis is working correctly in concert with redis-oplog.

Run these tests with scripts/runIntegrationMeteor.sh.

Integration tests part 4: performance tests

These tests are run in a similar environment to the acceptance tests, but only against a single Mongo and Redis version, and without the race detector.

We run two tests: one where we just fire-and-forget a bunch of writes to Mongo, and one where we do the same writes, but wait until we've received notifications about all of those writes from Redis. The difference between these two numbers gives us an upper bound on the latency overhead of using oplogtoredis. These tests fail if the overhead is greater than 35%.

Run these tests with scripts/runIntegrationPerformance.sh.