Closed CatalinMustata closed 1 year ago
I've managed to fix this and I have some takeaways on this.
Due to various reasons, our PGBouncer instance is set to transaction
and this cannot be changed. One important side effect of this is that simple queries executed outside of a transaction are considered as being a transaction themselves from PGBouncer's perspective. That means that two subsequent queries to the DB will possibly receive different connections to the DB - my assumption is that asyncpg
assumes the same connection between prepare & query requests.
That, added to the fact that asyncpg
does not have truly unique names, so if running multiple workers (and pods), each of them will start off with the same index and increase it monotonically. Assuming an even spread of requests, they're bound to have name clashes at one point, depending on what actual DB connection they receive from PGBouncer.
The solution, in our case was to re-enable implicit transactions in SQLAlchemy (we had them turned off for our read replicas) and subclass asyncpg's Connection
class so we can overwrite _get_unique_id
to use an uuid
instead of a monotonic index.
Should be fixed by #1065 which will be released in 0.29 soon.
I know the compatibility with PGBouncer in transaction mode has been discussed over and over again and we've already disabled the prepared statement cache, which made 1 out of 2 queries to fail.
However, we're still seeing
prepared statement "__asyncpg_stmt_XXX__" does not exist
popping up now and again in middleware executed by FastAPI before handling a request (no threads, although we are running a few things in a sync fashion using threads and psycopg2). I haven't been able to observe a specific scenario where this happens.Since there's no chance to change the PGBouncer config in the near future, I'd be grateful if someone could give me some advice on how to debug this. It could be because of our setup and maybe we can fix it by adjusting one setting or another.
PS: Locally, I've had some issues with the "statement already exists" error when code crashed while executing a query, but that's not something that happens when deployed.