Open nikhilro opened 1 month ago
I'm now more confident that this is an issue with Supavisor under load. One thing that would help is a statement time limit in postgres.js.
I know we have:
connection: {
statement_timeout: 1000 * 60 * 0.5, // 30 seconds, pg expects milliseconds
},
But that is not useful when connecting to transaction-mode poolers.
Is there an easy way to "abort" the query if it's taking too long?
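One approach that may work behind a transaction-mode pooler is to set the timeout per transaction instead of per connection. The sketch below uses `sql.begin` from postgres.js and Postgres's `set_config(name, value, true)`, which behaves like `SET LOCAL`. `withStatementTimeout` is a hypothetical helper name, not part of postgres.js, and the sketch assumes the pooler keeps a single server connection for the whole transaction; it is untested against Supavisor.

```javascript
// Hypothetical helper (not part of postgres.js): runs `fn` inside a
// transaction whose statement_timeout applies to that transaction only.
async function withStatementTimeout(sql, ms, fn) {
  const value = String(ms) + "ms"; // e.g. "30000ms"
  return sql.begin(async (tx) => {
    // is_local = true scopes the setting to the current transaction,
    // so it is discarded when the transaction ends.
    await tx`select set_config('statement_timeout', ${value}, true)`;
    return fn(tx);
  });
}

// Example (assumes `sql` is a postgres.js instance):
// const rows = await withStatementTimeout(sql, 30_000, (tx) => tx`select 1`);
```

If the timeout fires, Postgres cancels the statement and the promise should reject with a query-canceled error, which also rolls the transaction back.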
Please forgive me for not answering your exact question, but we have run into very similar issues, and it boiled down to our queries blocking and locking. We've used this query (modify as needed) to run down issues in our Supabase PG instance:
SELECT blocked_locks.pid AS blocked_pid,
blocked_activity.usename AS blocked_user,
blocking_locks.pid AS blocking_pid,
blocking_activity.usename AS blocking_user,
blocked_activity.query AS blocked_statement,
blocking_activity.query AS current_statement_in_blocking_process
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
ON blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.DATABASE IS NOT DISTINCT FROM blocked_locks.DATABASE
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
AND blocking_locks.pid != blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.GRANTED;
Smashing this query over and over when the problem is happening may give some insight as well:
SELECT
query,
avg(now() - query_start) AS average_duration,
usename,
pid
FROM
pg_stat_activity
WHERE
state = 'active'
AND query <> '<IDLE>'
AND query NOT LIKE '%pg_stat_activity%'
GROUP BY
query,
usename,
pid
ORDER BY
average_duration DESC
LIMIT 10000;
If your logs aren't helpful, ye olde ALTER SYSTEM SET log_lock_waits TO on; may yield more helpful info in the Postgres and/or Pooler logging views, as well as altering log_statement_sample_rate and log_min_duration_sample.
Re: your actual question - I did implement something hideous with await Promise.race(... which did not help. We did not have any specific issues with Supavisor but switched to running our own pgbouncers.
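A bare Promise.race only stops the caller from waiting; the losing query keeps running and keeps holding its connection, which may be why it did not help. The sketch below also calls the query's `cancel()` method, which postgres.js exposes on queries, when the timer fires. `withCancelAfter` is a hypothetical helper name, and whether a cancel request makes it through a pooler such as Supavisor or pgbouncer is an assumption to verify.

```javascript
// Hypothetical helper: rejects after `ms` and also asks postgres.js to
// cancel the in-flight query, instead of only abandoning the promise.
function withCancelAfter(query, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      query.cancel(); // request cancellation of the running query
      reject(new Error("query exceeded " + ms + " ms"));
    }, ms);
  });
  // Whichever settles first wins; the timer is always cleared afterwards.
  return Promise.race([query, timeout]).finally(() => clearTimeout(timer));
}

// Example (assumes `sql` is a postgres.js instance):
// const rows = await withCancelAfter(sql`select pg_sleep(60)`, 5000);
```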
Also, likewise, a shout-out to @porsager for making an amazing library that lets me avoid the hell of ORMs.
Hey @porsager, thanks for the package.
Creating this issue mostly in hopes that someone else has run into this issue.
We use Supabase for our Postgres DB and connect through their pooling service Supavisor in transaction mode; Supavisor is an alternative to pgBouncer (link). On the postgres.js side, we set prepare: false.
What we're seeing is that the postgres.js pooler will start swallowing queries at some point. Imagine a function like:
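The snippet itself is not shown here. A purely hypothetical function of that shape, with the two log lines taken from the description and an assumed `users` table, `id` parameter, and `getUser` name, might look like:

```javascript
// Hypothetical stand-in for the missing snippet; `getUser`, `users`, and
// `id` are assumed names, not taken from the original post.
async function getUser(sql, id) {
  console.log("trying to get the user");
  // `sql` is assumed to be a postgres.js instance (tagged-template query).
  const [user] = await sql`select * from users where id = ${id}`;
  console.log("got the user!");
  return user;
}
```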
After a while, this will print "trying to get the user" but not "got the user!". Does this ring any bells?
P.S. I've been trying to write an isolated script to reproduce this, but have been unsuccessful so far. The only reproduction is live production traffic.