samstokes commented 8 years ago

Right now a few things make it tricky to trial Bottled Water on an existing database taking production traffic:

the initial snapshot does a SELECT * from all tables, which will probably do bad things to the database's cache unless the entire dataset (plus indices) can fit in RAM;
on a large database, you may not even want to replay the entire contents into Kafka first, since your Kafka consumers may not care about older records, and Kafka machines may not be provisioned with as much disk space as Postgres machines;
if your Kafka cluster is configured with auto.create.topics.enable=false, you'll need to create the appropriate topics for every table before starting Bottled Water;
we might want to run Bottled Water on a standby server, so as to stream actual data at realistic volume without risking the performance or stability of the primary server; but this isn't possible since Postgres doesn't support logical decoding on a standby at present.

A related problem is what to do if Bottled Water is running on a primary server that fails, and we promote a standby to become the new primary. To keep the Kafka updates flowing, we probably want to point Bottled Water at the new primary, but:

if the replication slot doesn't get replicated over to the standby (not sure), then Bottled Water will start by redoing the snapshot, which will result in publishing duplicate updates to Kafka for the entire database;
depending on the mechanics of promoting the primary, we may need to start Bottled Water on a different timeline (e.g. if the primary was restored from an on-disk snapshot plus WAL replay, then began its own timeline)

This encompasses a lot of individual changes we might make (e.g. add a table whitelist, add a --skip-snapshot flag, integrate Zookeeper support so Bottled Water can create its own Kafka topics, etc), but it would be useful to have a plan for how this ought to work. Creating this issue to collect discussion about this topic!

badboyd commented 8 years ago

The initial snapshot does a SELECT * from all tables, which will probably do bad things to the database's cache unless the entire dataset (plus indices) can fit in RAM;

It can be worse, It can lock other transactions. BW do the snapshot like this SELECT row by row from all tables and non-stop. So I have a proposal for BW snapshot: fetch number of rows( for example 5000) send all to kafka, then wait for ACK then fetch next 5000 rows ? OR Can we do the snapshot on a standby server ? What do you think @samstokes ?

badboyd commented 8 years ago

101

samstokes commented 8 years ago

It can be worse, It can lock other transactions. BW do the snapshot like this SELECT row by row from all tables and non-stop. So I have a proposal for BW snapshot: fetch number of rows( for example 5000) send all to kafka, then wait for ACK then fetch next 5000 rows ? OR Can we do the snapshot on a standby server ?

Are you sure that the SELECT can block other transactions? My understanding was that Postgres' MVCC design meant that queries running on a consistent snapshot (at the REPEATABLE READ isolation level) would execute without blocking other transactions.

badboyd commented 8 years ago

Are you sure that the SELECT can block other transactions? My understanding was that Postgres' MVCC design meant that queries running on a consistent snapshot (at the REPEATABLE READ isolation level) would execute without blocking other transactions.

Yeah, It can not block other transactions according to Doc. But, when I start BW snapshot, other transactions start executing for much much more longer time than usual, some transaction cannot execute.

samstokes commented 8 years ago

I guess that makes sense - the long-running SELECT will cause a lot of CPU and I/O, and lots of table bloat if there is high write volume, both of which could slow down other queries.

some transaction cannot execute.

Do you mean that you observe some transactions aborting with an error? Do you see any specific error message, or are they just timing out?

I suppose to support snapshots of large databases we might need to either run the snapshot on a standby like you suggest, or teach Bottled Water how to restore Postgres backups directly.

samstokes commented 8 years ago

Unfortunately, according to this blog post, it doesn't look like there's currently a way to have a logical replication consumer like Bottled Water start consuming from a specific point in the past, which we'd need to avoid missing updates after either bootstrapping from a standby server or failing over to a promoted standby.

confluentinc / bottledwater-pg

Design for adding to an existing database #99

101