citusdata / citus

Distributed PostgreSQL as an extension
https://www.citusdata.com
GNU Affero General Public License v3.0
10.42k stars 662 forks source link

End worker transactions early in complex SELECT queries to reduce idle in transaction time #2187

Open marcocitus opened 6 years ago

marcocitus commented 6 years ago

When a multi-shard SELECT query has a CTE or non-pushdownable subquery, we execute the query in a distributed transaction even if the SELECT is not in a transaction block. This means that we first send BEGIN on each connection and send commit/abort at the end of the transaction on the coordinator. Between the execution of individual steps, the connections are in an idle in transaction state. Postgres connections are a scarce resource and we should not allow them to be idle, since it increases the likelihood of the worker node running out of connections (or memory?).

The reason we use a distributed transaction when there is a CTE/non-pushdownable subquery is that the intermediate result is coupled to the distributed transaction ID. Queries can only access the intermediate result if they are part of the same distributed transaction and the result is removed when the distributed transaction ends. However, we can actually commit/abort individual worker transactions and later start a new transaction which is assigned the same distibuted transaction ID. To avoid adding extra round-trips, we could send the transaction block as a single batch BEGIN; SELECT assign_distributed_transaction_id(...); SELECT ... ; COMMIT. That way, the connection is released into the pgbouncer pool immediately after the SELECT finishes and can be used for other work.

None of this applies if the multi-shard SELECT is in a transaction block, especially if it needs to be able to read uncommitted results. However, it could provide much better throughput for single-statement, multi-shard SELECTs.

marcocitus commented 6 years ago

It may actually be worth doing this for all multi-shard SELECTs to allow operators to know which query on a worker belongs to which distributed query on the coordinator.