Closed joshbwlng closed 3 years ago
More EXPLAIN
result for a simpler query on production:
jellyfish=> explain (costs, verbose) select data->'from'->'type' from cards where type='link@1.0.0' limit 1000;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Limit (cost=0.56..260.63 rows=1000 width=32)
Output: (((data -> 'from'::text) -> 'type'::text))
-> Index Scan using link__name__idx on public.cards (cost=0.56..5699435.70 rows=21915317 width=32)
Output: ((data -> 'from'::text) -> 'type'::text)
(4 rows)
jellyfish=> explain (analyze, costs, buffers, verbose) select data->'from'->'type' from cards where type='link@1.0.0' limit 1000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..260.63 rows=1000 width=32) (actual time=0.048..1.640 rows=1000 loops=1)
Output: (((data -> 'from'::text) -> 'type'::text))
Buffers: shared hit=357 read=39
I/O Timings: read=0.559
-> Index Scan using link__name__idx on public.cards (cost=0.56..5699449.62 rows=21915373 width=32) (actual time=0.048..1.543 rows=1000 loops=1)
Output: ((data -> 'from'::text) -> 'type'::text)
Buffers: shared hit=357 read=39
I/O Timings: read=0.559
Planning Time: 0.791 ms
Execution Time: 1.736 ms
(10 rows)
jellyfish=> explain (analyze, costs, buffers, verbose) select data from cards where type='link@1.0.0' and (cards.data#>>'{"from", "type"}')::text='product-improvement@1.0.0' limit 5;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..260.63 rows=5 width=374) (actual time=40027.808..59088.741 rows=5 loops=1)
Output: data
Buffers: shared hit=2323222 read=3629084 dirtied=1132
I/O Timings: read=44288.287
-> Index Scan using link__name__idx on public.cards (cost=0.56..5701498.14 rows=109616 width=374) (actual time=40027.807..59088.738 rows=5 loops=1)
Output: data
Filter: ((cards.data #>> '{from,type}'::text[]) = 'product-improvement@1.0.0'::text)
Rows Removed by Filter: 14278524
Buffers: shared hit=2323222 read=3629084 dirtied=1132
I/O Timings: read=44288.287
Planning Time: 0.871 ms
Execution Time: 59088.757 ms
(12 rows)
jellyfish=> explain (analyze, costs, buffers, verbose) select data from cards where type='link@1.0.0' and (cards.data#>>'{"from", "type"}')::text='product-improvement@1.0.0' limit 1000;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=1000.56..51363.67 rows=1000 width=374) (actual time=11556.092..45877.047 rows=26 loops=1)
Output: data
Buffers: shared hit=4096523 read=5334016 dirtied=20262
I/O Timings: read=111642.289
-> Gather (cost=1000.56..5521653.18 rows=109617 width=374) (actual time=11556.091..45877.041 rows=26 loops=1)
Output: data
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=4096523 read=5334016 dirtied=20262
I/O Timings: read=111642.289
-> Parallel Index Scan using link__name__idx on public.cards (cost=0.56..5509691.48 rows=45674 width=374) (actual time=11303.556..45873.382 rows=9 loops=3)
Output: data
Filter: ((cards.data #>> '{from,type}'::text[]) = 'product-improvement@1.0.0'::text)
Rows Removed by Filter: 7402458
Buffers: shared hit=4096523 read=5334016 dirtied=20262
I/O Timings: read=111642.289
Worker 0: actual time=10828.870..45872.607 rows=8 loops=1
Buffers: shared hit=1369245 read=1780992 dirtied=6690
I/O Timings: read=37194.538
Worker 1: actual time=11525.900..45872.884 rows=12 loops=1
Buffers: shared hit=1363763 read=1775441 dirtied=6750
I/O Timings: read=37200.183
Planning Time: 0.839 ms
Execution Time: 45877.069 ms
(24 rows)
SOLVED
The index definitions didn't match the queries. Redefining the indexes to the following solved the problem:
CREATE INDEX link__data_from_type__idx ON public.cards USING btree ((data#>>'{"from", "type"}')) WHERE (type = 'link@1.0.0'::text);
CREATE INDEX link__data_from_id__idx ON public.cards USING btree ((data#>>'{"from", "id"}')) WHERE (type = 'link@1.0.0'::text);
CREATE INDEX link__data_to_type__idx ON public.cards USING btree ((data#>>'{"to", "type"}')) WHERE (type = 'link@1.0.0'::text);
CREATE INDEX link__data_to_id__idx ON public.cards USING btree ((data#>>'{"to", "id"}')) WHERE (type = 'link@1.0.0'::text);
Tried the following indexes on the production database to see if the Postgres planner would use them over the current indexes. The results are promising as they seem to be a better match:
CREATE INDEX CONCURRENTLY link_test_idx ON public.cards USING btree (((data#>>'{"from","id"}')), name, ((data#>>'{"to","id"}'))) WHERE (type = 'link@1.0.0'::text);
Limit (cost=0.56..2.86 rows=1 width=32)
-> Index Scan using link_test_idx on cards (cost=0.56..2.86 rows=1 width=32)
Index Cond: (((data #>> '{from,id}'::text[]) = '1234'::text) AND (name = 'test'::text) AND ((data #>> '{to,id}'::text[]) = '5678'::text))
Filter: ((jsonb_typeof(data) = 'object'::text) AND (jsonb_typeof((data #> '{from}'::text[])) = 'object'::text) AND (jsonb_typeof((data #> '{to}'::text[])) = 'object'::text) AND (jsonb_typeof((data #> '{from,id}'::text[])) = 'string'::text) AND (jsonb_typeof((data #> '{to,id}'::text[])) = 'string'::text) AND (NOT (SubPlan 1)))
SubPlan 1
-> Function Scan on unnest contents (cost=0.00..0.15 rows=8 width=0)
Filter: ((contents <> ALL ('{user-jellyfish,org-balena}'::text[])) AND (contents !~ '(^|\+)(user-jellyfish|org-balena)($|\+)'::text))
CREATE INDEX CONCURRENTLY "support-thread_test_idx" ON public.cards USING btree ((data#>>'{"status"}')) WHERE (type = 'support-thread@1.0.0'::text);
Limit (cost=0.29..26.73 rows=1 width=32)
-> Index Scan using "support-thread_test_idx" on cards (cost=0.29..26.73 rows=1 width=32)
Index Cond: ((data #>> '{status}'::text[]) = 'open'::text)
Filter: ((jsonb_typeof(data) = 'object'::text) AND (jsonb_typeof((data #> '{status}'::text[])) = 'string'::text) AND (NOT (SubPlan 1)))
SubPlan 1
-> Function Scan on unnest contents (cost=0.00..0.15 rows=8 width=0)
Filter: ((contents <> ALL ('{user-jellyfish,org-balena}'::text[])) AND (contents !~ '(^|\+)(user-jellyfish|org-balena)($|\+)'::text))
CREATE INDEX CONCURRENTLY "summarytestidx" ON public.cards USING gin ((data#>'{"readBy"}')) WHERE (type = 'summary@1.0.0'::text);
Limit (cost=2.20..6.82 rows=1 width=32)
-> Bitmap Heap Scan on cards (cost=2.20..6.82 rows=1 width=32)
Recheck Cond: (((data #> '{readBy}'::text[]) @> '"test"'::jsonb) AND (type = 'summary@1.0.0'::text))
Filter: ((jsonb_typeof(data) = 'object'::text) AND (jsonb_typeof((data #> '{readBy}'::text[])) = 'array'::text) AND (NOT (SubPlan 1)))
-> Bitmap Index Scan on summary__test__idx (cost=0.00..2.20 rows=4 width=0)
Index Cond: ((data #> '{readBy}'::text[]) @> '"test"'::jsonb)
SubPlan 1
-> Function Scan on unnest contents (cost=0.00..0.15 rows=8 width=0)
Filter: ((contents <> ALL ('{user-jellyfish,org-balena}'::text[])) AND (contents !~ '(^|\+)(user-jellyfish|org-balena)($|\+)'::text))
Original conversation: https://jel.ly.fish/7950669f-fbb7-46f3-b95b-e2be9a393520
Useful commands
SDK Query
Generated SQL
EXPLAIN results (production)
Production index checks
Confirmed that the indexes are not invalid and are populated with data:
Index definitions (production)