hanlaur opened this issue 4 years ago
@hanlaur This is an unfortunately common issue with PREPARE and PG automation: after the fifth execution of a prepared statement, PG may switch to a generic plan. From the PG docs:
The current rule for this is that the first five executions are done with custom plans and the average estimated cost of those plans is calculated.
The generic plan does not optimize queries on hypertables well, so to overcome this set plan_cache_mode to force_custom_plan.
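A minimal sketch of the fix; the table and prepared-statement names here are illustrative, not from the reporter's schema:

-- Force a fresh custom plan on every execution so chunk exclusion can
-- use the actual parameter values (plan_cache_mode requires PG12+).
SET plan_cache_mode = 'force_custom_plan';

PREPARE q(timestamptz) AS
    SELECT max(val) FROM conditions WHERE time >= $1;

-- With the default plan_cache_mode = 'auto', executions 1-5 get custom
-- plans; from the 6th execution PG may switch to a cached generic plan,
-- where $1 is unknown at plan time and no chunks are excluded.
EXECUTE q(now() - interval '1 day');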
Thanks @k-rus. Yes, agreed, force_custom_plan helps. So these later-stage exclusions, like the Stage 3: Executor runtime exclusion (https://blog.timescale.com/blog/implementing-constraint-exclusion-for-faster-query-performance/), do not help here at all?
@hanlaur Since the time condition comes in as a parameter, in the generic plan no chunks can be excluded during planning: the time value is known only at runtime.
Runtime exclusion should work here; while plan-time exclusion is certainly preferred, runtime exclusion is still better than no exclusion at all.
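One way to see which kind of exclusion you are getting (assuming the prepared statement q from the sketch above):

-- Plan-time exclusion: excluded chunks are absent from the plan entirely.
-- Runtime exclusion: excluded chunks remain in the plan but are marked
-- "(never executed)" in the EXPLAIN ANALYZE output.
EXPLAIN (ANALYZE, BUFFERS) EXECUTE q(now() - interval '1 day');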
I'm hit by this problem as well; in my case, being on PG11, forcing the custom plan is not an option, since plan_cache_mode was only introduced in PG12.
You can work around this on PG11 with dynamic sql by creating a function like this:
CREATE OR REPLACE FUNCTION f1(i1 int, i2 int, devices text[]) RETURNS int LANGUAGE plpgsql
AS $BODY$
DECLARE
    result int;
BEGIN
    -- Dynamic SQL via EXECUTE is planned afresh on every call, so the
    -- planner sees the concrete parameter values and can exclude chunks;
    -- no generic plan is ever cached for this statement.
    EXECUTE 'SELECT max(val) FROM skip_scan_ht WHERE time >= $1 AND time > $2 AND dev_name = ANY($3)'
        INTO result
        USING i1, i2, devices;
    RETURN result;
END
$BODY$;
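It is called like any ordinary function; the values here are illustrative:

-- Each call plans the inner query with these concrete values.
SELECT f1(0, 100, ARRAY['dev1', 'dev2']);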
I wanted to let folks know that this can lead to rather catastrophically poor-performing queries: from milliseconds to at least 10 minutes. In our case this happens when the first 5 calls are for very long time ranges with little to no data in them (subject to an ID filter); after that, every call (even on a very tight time range) takes orders of magnitude longer than with a dynamic plan. Runtime chunk exclusion is not enough to save us in these cases!
Hello, I just want to note that this behavior should be in the documentation. I ran into it while querying data using asyncpg, which creates a prepared statement for every query. Over a hypertable with 76,000 child tables this led to an out-of-shared-memory error, because while planning (I guess) it tried to include all the child tables.
Relevant system information: TimescaleDB 1.7.4 (OSS) on PostgreSQL 12 (the timescale/timescaledb:1.7.4-pg12-oss Docker image used in the reproduction below).
Describe the bug
Scenario: Using prepared statements to query data. The time period used is 1 day and the table chunking is 1 day. Irrelevant time chunks are excluded for the first 5 iterations of the query as expected, i.e., at most two chunks are scanned. But from the 6th execution onwards, PostgreSQL statistics indicate that index blocks are being read for all chunks, even though it should be possible to exclude them.
Setting plan_cache_mode to force_custom_plan (https://www.postgresql.org/docs/12/sql-prepare.html) appears to help avoid the behavior. Is this a bug, or is there a limitation related to generic plans?
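A minimal sketch of the reported pattern (schema and values are illustrative; the complete, runnable sequence is in the attachments below):

-- Hypertable chunked by 1 day, queried over a 1-day window.
PREPARE get_max(timestamptz, timestamptz) AS
    SELECT max(val) FROM conditions WHERE time >= $1 AND time < $2;

-- Executions 1-5: custom plans, at most two chunks scanned.
-- Execution 6 onwards: a cached generic plan may be used, and
-- pg_statio_user_tables then shows index blocks read for all chunks.
EXECUTE get_max(now() - interval '1 day', now());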
To Reproduce
Complete sequence to reproduce, including table definitions and explain output, attached.
docker run -d --name pg -e POSTGRES_PASSWORD=pass timescale/timescaledb:1.7.4-pg12-oss
docker exec -i pg psql -U postgres < explain.sql
explain.sql: https://gist.github.com/hanlaur/c8165c01ee6e6302008beff842b453e8 (explains only, with outputs as comments)
execute.sql: https://gist.github.com/hanlaur/1829cd98d307a0e5065ff508e0b8b116 (execution + pg_statio_user_tables)
explain_execute.zip
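To see which chunks' indexes were actually touched, the per-chunk I/O statistics referenced in execute.sql can be inspected (a sketch; TimescaleDB keeps chunks in the _timescaledb_internal schema):

-- Non-zero index block counters on chunks outside the queried window
-- indicate that exclusion did not happen.
SELECT relname, idx_blks_read, idx_blks_hit
FROM pg_statio_user_tables
WHERE schemaname = '_timescaledb_internal'
ORDER BY relname;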
Expected behavior
Irrelevant time chunks are excluded also on the 6th execution (possibly with the exclusion happening in a later stage).
Actual behavior
Index blocks of irrelevant time chunks are being scanned.