stripe-archive / mosql

MongoDB → PostgreSQL streaming replication
MIT License

MoSQL Performance #111

Open BradRuderman opened 8 years ago

BradRuderman commented 8 years ago

I don't actually use MoSQL; I'm posting to see whether anyone has a better way to store the oplog time tracker. Updating the database every time an oplog record is handled can be quite expensive. In fact, between querying Mongo for the full record on each update and writing the timestamp to the database, I am seeing only 10 oplog entries processed per second. I was thinking of using Redis to store the time instead of Postgres, but it seems like overkill to set up Redis just to store a single key/value for this one process. I just wanted to get thoughts from other people who are tailing the oplog.

BTW, my infrastructure is just a tailer that pushes the full record to a queue. So I have found the bottleneck, and the low-hanging fruit is to find a different, but still persistent, way of storing that time tracker.

nelhage commented 8 years ago

MoSQL relies on the idempotence of the oplog, and only updates the stored timestamp in the database once every minute or so. If your application can tolerate a small amount of replay in a failure situation, this is an easy trick.

10 updates per second seems quite slow, even for durable writes to postgres, if you have a remotely beefy database machine. Consider investing in tuning or profiling your database.
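
To illustrate the periodic-checkpoint trick mentioned above, here is a minimal Python sketch. It is not MoSQL's actual code; the class name `ThrottledCheckpoint`, the `save` callback, and the injectable `clock` are all hypothetical names chosen for this example. It assumes that replaying oplog entries after a crash is safe because they are applied idempotently.

```python
import time
from typing import Callable, Optional


class ThrottledCheckpoint:
    """Persist the latest oplog timestamp at most once per `interval` seconds.

    Hypothetical helper for illustration; `save` stands in for whatever
    durable write you use (e.g. an UPDATE against a Postgres row).
    """

    def __init__(self, save: Callable[[int], None], interval: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._save = save
        self._interval = interval
        self._clock = clock
        self._last_flush = clock()
        self._pending: Optional[int] = None

    def record(self, ts: int) -> None:
        """Remember the newest handled timestamp; write it only when due."""
        self._pending = ts
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Write the pending timestamp, if any, and restart the interval."""
        if self._pending is not None:
            self._save(self._pending)
            self._pending = None
        self._last_flush = self._clock()


# Demo with a fake clock: 100 oplog entries, one per simulated second.
saved = []            # stands in for the durable store
fake_now = [0.0]
checkpoint = ThrottledCheckpoint(saved.append, interval=60.0,
                                 clock=lambda: fake_now[0])
for ts in range(1, 101):
    fake_now[0] += 1.0
    checkpoint.record(ts)
checkpoint.flush()    # final write on clean shutdown
```

In the demo, 100 handled entries result in two durable writes instead of 100. The trade-off is that a crash replays up to one interval's worth of entries from the last saved timestamp, which is only acceptable if applying them twice is harmless.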