Closed jwj61 closed 2 years ago
I don't know how to tackle this one.
The origin=DIR
triggers a containment query, but Postgres only uses the sorting order index and then filters the results a posteriori.
lmfdb=# explain analyze SELECT "origin" FROM "lfunc_search" WHERE "rational" = true AND "instance_types" @> ARRAY[ARRAY['DIR']]::text[] ORDER BY "root_analytic_conductor" LIMIT 1000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.56..8629.28 rows=1000 width=47) (actual time=0.032..14414.905 rows=1000 loops=1)
-> Index Scan using lfunc_search_root_analytic_conductor on lfunc_search (cost=0.56..5885157.62 rows=682043 width=47) (actual time=0.031..14412.729 rows=1000 loops=1)
Filter: (rational AND (instance_types @> '{{DIR}}'::text[]))
Rows Removed by Filter: 15692517
Planning Time: 0.428 ms
Execution Time: 14417.163 ms
The time is much better now as I have been trying different approaches for a while, so the table is partially cached.
I think we need to start to think about advanced statistics https://www.postgresql.org/docs/13/planner-stats.html#PLANNER-STATS-EXTENDED
but is not clear to me what statistic I should add to help the planner.
I thought more about this and I think the best solution is to change the data structure and have a boolean column per type, so we can index those properly.
I've added columns is_instance_DIR
, etc, and am filling them in now. I'll make some code changes once the data is there.
The data is there on beta, but I'm not going to be able to make the code changes tonight.
The data is now on prod. I will write the code on Friday
@rbommel, perhaps is another good speed up.
At the moment we search like input_type in types
While we should just check if type_{input_type}
is true.
@edgarcosta, I cannot reproduce this issue, both online and locally, these pages seem to be pretty fast. Maybe @roed314 already solved this?
No, the searches are still done the slow way. Here are a couple of slow queries that we saw last week
2022-01-16 04:54:14.565 UTC [50717] webserver@lmfdb LOG: duration: 8561.177 ms statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "degree" = 4 AND "rational" = true AND "instance_types" @> ARRAY[ARRAY['ECNF']]::text[] ORDER BY "root_analytic_conductor" LIMIT 50
2022-01-16 16:40:20.901 UTC [177307] webserver@lmfdb LOG: duration: 19131.811 ms statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "instance_types" @> ARRAY[ARRAY['BMF']]::text[] ORDER BY "root_analytic_conductor" LIMIT 1000
2022-01-17 18:37:06.171 UTC [448548] webserver@lmfdb LOG: duration: 5969.560 ms statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "degree" = 4 AND "instance_types" @> ARRAY[ARRAY['ECNF']]::text[] ORDER BY "root_analytic_conductor" LIMIT 50
2022-01-21 19:22:18.691 UTC [1509034] webserver@lmfdb LOG: duration: 8285.022 ms statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "degree" = 4 AND "instance_types" @> ARRAY[ARRAY['ECNF']]::text[] ORDER BY "root_analytic_conductor" LIMIT 50
2022-01-21 20:53:09.030 UTC [1526089] webserver@lmfdb LOG: duration: 18206.163 ms statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "instance_types" @> ARRAY[ARRAY['ECQ']]::text[] ORDER BY "root_analytic_conductor" LIMIT 1000
Pull request #4959 should improve this. But I did notice that indices seem to be missing for some of these columns. To really get improvement, these indices should be added to lfunc_search.
indexes added
From the feedback page:
https://www.lmfdb.org/L/rational?origin=DIR
produces "Error: The search query took longer than expected! Please help us improve by reporting this error here."