The issue occurs when we stream multiple Postgres results through multiple different scans over the same connection. It is unique to the new ATTACH method, which has a "main connection", whereas the postgres_scan methods always open new connections.
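As a rough illustration of the failure mode, here is a toy model in Python. It assumes a connection can serve only one streamed result at a time. `ToyConnection`, `BusyConnectionError`, and `interleave_two_scans` are hypothetical names made up for this sketch and are not part of the extension.

```python
class BusyConnectionError(RuntimeError):
    """Raised when a second stream is started on a busy toy connection."""


class ToyConnection:
    """Hypothetical stand-in: serves at most one streamed result at a time."""

    def __init__(self):
        self.active_stream = None

    def stream(self, name, rows):
        # The check runs lazily, when the first row is requested.
        if self.active_stream is not None:
            raise BusyConnectionError(
                f"{name!r} cannot start while {self.active_stream!r} is streaming"
            )
        self.active_stream = name
        try:
            for row in rows:
                yield row
        finally:
            # The connection becomes free once the stream is exhausted or closed.
            self.active_stream = None


def interleave_two_scans(conn_a, conn_b):
    """Consume two scans in lockstep, as an operator reading both inputs might."""
    first = conn_a.stream("scan_a", [1, 2])
    second = conn_b.stream("scan_b", [10, 20])
    return [(row, next(second)) for row in first]
```

Passing the same `ToyConnection` for both arguments raises `BusyConnectionError` on the first row of the second scan, while two separate connections interleave without trouble.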
We solve this issue by doing the following:
By default, a Postgres scan is not streaming and instead fully materializes the query result. This will always work regardless of other operators in the query.
There is a new optimization pass (PostgresOptimizer) that inspects a given query. If the query contains only a single scan of a Postgres table, we can always stream the result, even over the main connection. If the query contains multiple scans, we can only stream over newly opened connections; otherwise we need to materialize the query result.
This solves the issue and allows us to re-use the main connection as much as possible, while remaining correct in all cases, whether or not the optimizer runs.
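The decision rule described above can be sketched in Python as follows. This is only an illustration of the logic, not the actual PostgresOptimizer code: `PlanNode`, `ScanMode`, `collect_scans`, `choose_scan_modes`, and the `can_open_new_connections` flag are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class ScanMode(Enum):
    MATERIALIZE = "materialize"          # default when no optimizer decision is made
    STREAM_MAIN = "stream_main"          # stream over the main connection
    STREAM_NEW = "stream_new"            # stream over a newly opened connection


@dataclass
class PlanNode:
    """Hypothetical query plan node; only the fields needed for the sketch."""
    name: str
    is_postgres_scan: bool = False
    children: list = field(default_factory=list)


def collect_scans(node):
    """Return every Postgres scan in the plan, in depth-first order."""
    scans = [node] if node.is_postgres_scan else []
    for child in node.children:
        scans.extend(collect_scans(child))
    return scans


def choose_scan_modes(root, can_open_new_connections=True):
    """Map each Postgres scan name to the mode the rule above would allow."""
    scans = collect_scans(root)
    if len(scans) == 1:
        # A single scan never competes for the main connection.
        return {scans[0].name: ScanMode.STREAM_MAIN}
    # Multiple scans: stream only over new connections, else materialize.
    mode = ScanMode.STREAM_NEW if can_open_new_connections else ScanMode.MATERIALIZE
    return {scan.name: mode for scan in scans}
```

For example, a plan with one scan under a filter maps that scan to `STREAM_MAIN`, while a join over two scans maps both to `STREAM_NEW`, or to `MATERIALIZE` when `can_open_new_connections` is false.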
This PR also turns the connection cache back on by default and renames the setting to pg_connection_cache (instead of pg_experimental_connection_cache). In addition, we add cardinality measurement to the scan (PostgresScanCardinality).
Fixes #156