Subscribing to a publication causes WAL to bloat on database if there is not changes being made to the publishing tables while modification is being made on other non-publishing tables.
Reproduction
Setup the following via psql
create table tbl_a (id int);
create table tbl_b (id int);
create publication pub_a for table tbl_a;
Now subscribe to the publication.
import postgres from 'postgres'
const sql = postgres({..., publications: ['pub_a']});
sql.subscribe('*',async (row, meta) => {});
setInterval(async () => {
// writes to non-publishing table
await sql`insert into public.tbl_b (id) select generate_series(0, 1000)`;
console.log(await sql`select sent_lsn, write_lsn from pg_stat_replication`)
}, 5000)
Observe the sent_lsn and write_lag keep increasing, while write_lsn will stay the same forever. This causes WAL to bloat as the database will hold on to the old logs.
Now if instead of writing to tbl_b, we update our example to write to tbl_a instead, which is part of the publication, now the write_lsn will increment over time as expected.
Description
Subscribing to a publication causes WAL to bloat on database if there is not changes being made to the publishing tables while modification is being made on other non-publishing tables.
Reproduction
Setup the following via psql
Now subscribe to the publication.
Observe the sent_lsn and write_lag keep increasing, while write_lsn will stay the same forever. This causes WAL to bloat as the database will hold on to the old logs.
Now if instead of writing to tbl_b, we update our example to write to tbl_a instead, which is part of the publication, now the write_lsn will increment over time as expected.
Problem The
state.lsn
currently is only being updated after receiving a Begin Message. But this type of message is only be sent by walsender when there is changes to the publishing table.https://github.com/porsager/postgres/blob/6f20a4820c683b33e7670b606d8daf5670f4b973/src/subscribe.js#L178-L181
I think the state.lsn should also be update here on receiving a Primary Keep Alive. https://github.com/porsager/postgres/blob/6f20a4820c683b33e7670b606d8daf5670f4b973/src/subscribe.js#L112 At the moment, postgres.js responses to the primary keep alive message by using the current state.lsn without updating the lsn first. https://github.com/porsager/postgres/blob/6f20a4820c683b33e7670b606d8daf5670f4b973/src/subscribe.js#L126-L133 This causes the database to think our replication process is stalled and hold on the old WAL indefinitely.