Open fmbiete opened 1 year ago
What about using this logic when pg_is_in_recovery ?
{
semver.MustParseRange(">=9.4.0 <10.0.0"),
`
- SELECT slot_name, database, active, pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn)
+ SELECT slot_name, database, active,
+ (case pg_is_in_recovery() when 't' then pg_xlog_location_diff(pg_last_xlog_receive_location(), restart_lsn) else pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn) end) as pg_xlog_location_diff
FROM pg_replication_slots
`,
},
{
semver.MustParseRange(">=10.0.0"),
`
- SELECT slot_name, database, active, pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
+ SELECT slot_name, database, active,
+ (case pg_is_in_recovery() when 't' then pg_wal_lsn_diff(pg_last_wal_receive_lsn(), restart_lsn) else pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) end) as pg_wal_lsn_diff
FROM pg_replication_slots
`,
},
What about using this logic when pg_is_in_recovery ?
That doesn't throw an error but there are some buts.
_pg_last_wal_receivelsn() can returns NULL, so we would need to make the statement resilient to it
I'm not sure of the correctness of that data, it shows 2 different things depending of the server status
The value in the case of being in recovery could be quite off, compared to the normal value.
What do you think?
I am not sure what is the best solution here, but anything would be better than the original failing query.
In any case, the patch proposed here leads to no metrics being sent back from the standby at all, is that what we want?
Yes, that query should only be executed in the primary.
pg_last_wal_receive_lsn()
instead of pg_current_wal_lsn()
as a workaround.
WAL functions cannot be executed in standby servers
In PostgreSQL 16 replication slots persist in the standby servers. This is also the case when using extensions like pg_failover_slots that transfer the slot information to the standby.
This condition will make those queries to return values only in the primary node avoiding the errors.