flightaware / Pgtcl

Tcl client side interface to PostgreSQL (libpgtcl)
https://flightaware.github.io/Pgtcl/
BSD 3-Clause "New" or "Revised" License

copy Postgres to SQLite, get last LSN #52

Open tantaman opened 4 months ago

tantaman commented 4 months ago

I saw that pgtcl can be used to quickly copy a Postgres DB to SQLite. Once I do this, however, how can I know what LSN I should start from to continue replicating (via logical replication) the PG DB to SQLite?

resuna commented 4 months ago

That's what I implemented deltaflood for. The work is probably incomplete, however, because the project that required it changed its focus. Have a look at https://github.com/flightaware/pg-deltaflood

resuna commented 4 months ago

Yes, the reader side is still internal/research quality and hasn't been released.

tantaman commented 4 months ago

How likely is it the reader side would be released?

resuna commented 4 months ago

I would have to go through and remove any proprietary code, then see whether what's left is functional and, if not, implement the missing bits. Your reader is probably going to be application-specific. You need to read a line, split it on tabs into name-value pairs, then perform backslash-escape substitution on each value to parse it. The Tcl [subst] command would do that. Check the README.md for how to handle the update and replace operations on the new database, depending on the primary key.
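
The reader steps described above (read a line, split on tabs into name-value pairs, undo the backslash escapes in each value) might be sketched in Python roughly as follows. This is only an illustration, not the unreleased reader: it assumes the fields alternate name, value, name, value, and that the escapes are limited to `\t`, `\n`, `\r` and `\\`. The helper names `unescape` and `parse_line` are hypothetical, and the pg-deltaflood README remains the authority on the actual line format.

```python
import re

# Assumed escape set; the real writer may emit others.
_ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}


def unescape(value: str) -> str:
    """Undo backslash escapes in one left-to-right pass.

    A single pass matters: an escaped backslash followed by 'n'
    must not be misread as a newline escape.
    Unknown escapes fall back to the escaped character itself.
    """
    return re.sub(
        r"\\(.)",
        lambda m: _ESCAPES.get(m.group(1), m.group(1)),
        value,
    )


def parse_line(line: str) -> dict:
    """Split one line on tabs into {name: value}, unescaping each value.

    Assumes alternating name/value fields (hypothetical layout).
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) % 2 != 0:
        raise ValueError(f"expected name/value pairs, got {len(fields)} fields")
    names = fields[0::2]
    values = fields[1::2]
    return {name: unescape(value) for name, value in zip(names, values)}


if __name__ == "__main__":
    # Literal tabs separate fields; tabs and newlines inside a value arrive escaped.
    sample = "id\t42\tnote\tline one\\nline two\n"
    print(parse_line(sample))
```

If the reader is written in Tcl instead, note that `subst` also performs command and variable substitution unless `-nocommands` and `-novariables` are given, which is usually what you want when handling data from another system.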

resuna commented 4 months ago

Yeah, looking at our code, we're taking a side route via Kafka that you probably don't want to get involved in. :)

resuna commented 4 months ago

You can also look at Debezium, though it doesn't have an SQLite connector yet, so you'd have to write that anyway.

NasaGeek commented 4 months ago

One could probably use the Debezium JDBC connector combined with an SQLite JDBC library like https://github.com/xerial/sqlite-jdbc to perform the replication.

tantaman commented 4 months ago

Also, any idea how fast your Postgres to SQLite conversion is? Say, for a 10 GB DB of 30 million rows.