LMFDB / lmfdb

L-Functions and Modular Forms Database
Other
246 stars 199 forks source link

L-function search too slow #4457

Closed jwj61 closed 2 years ago

jwj61 commented 3 years ago

From the feedback page:

https://www.lmfdb.org/L/rational?origin=DIR

produces "Error: The search query took longer than expected! Please help us improve by reporting this error here."

edgarcosta commented 3 years ago

I don't know how to tackle this one.

The origin=DIR triggers a containment query, but Postgres only uses the sorting order index and then filters the results a posteriori.

lmfdb=# explain analyze SELECT "origin" FROM "lfunc_search" WHERE "rational" = true AND "instance_types" @> ARRAY[ARRAY['DIR']]::text[] ORDER BY "root_analytic_conductor" LIMIT 1000;
                                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..8629.28 rows=1000 width=47) (actual time=0.032..14414.905 rows=1000 loops=1)
   ->  Index Scan using lfunc_search_root_analytic_conductor on lfunc_search  (cost=0.56..5885157.62 rows=682043 width=47) (actual time=0.031..14412.729 rows=1000 loops=1)
         Filter: (rational AND (instance_types @> '{{DIR}}'::text[]))
         Rows Removed by Filter: 15692517
 Planning Time: 0.428 ms
 Execution Time: 14417.163 ms

The time is much better now as I have been trying different approaches for a while, so the table is partially cached.

I think we need to start to think about advanced statistics https://www.postgresql.org/docs/13/planner-stats.html#PLANNER-STATS-EXTENDED

but is not clear to me what statistic I should add to help the planner.

edgarcosta commented 3 years ago

I thought more about this and I think the best solution is to change the data structure and have a boolean column per type, so we can index those properly.

roed314 commented 3 years ago

I've added columns is_instance_DIR, etc, and am filling them in now. I'll make some code changes once the data is there.

roed314 commented 3 years ago

The data is there on beta, but I'm not going to be able to make the code changes tonight.

edgarcosta commented 3 years ago

The data is now on prod. I will write the code on Friday

edgarcosta commented 2 years ago

@rbommel, perhaps is another good speed up. At the moment we search like input_type in types While we should just check if type_{input_type} is true.

rbommel commented 2 years ago

@edgarcosta, I cannot reproduce this issue, both online and locally, these pages seem to be pretty fast. Maybe @roed314 already solved this?

edgarcosta commented 2 years ago

No, the searches are still done the slow way. Here are a couple of slow queries that we saw last week

2022-01-16 04:54:14.565 UTC [50717] webserver@lmfdb LOG:  duration: 8561.177 ms  statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "degree" = 4 AND "rational" = true AND "instance_types" @> ARRAY[ARRAY['ECNF']]::text[] ORDER BY "root_analytic_conductor" LIMIT 50
2022-01-16 16:40:20.901 UTC [177307] webserver@lmfdb LOG:  duration: 19131.811 ms  statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "instance_types" @> ARRAY[ARRAY['BMF']]::text[] ORDER BY "root_analytic_conductor" LIMIT 1000
2022-01-17 18:37:06.171 UTC [448548] webserver@lmfdb LOG:  duration: 5969.560 ms  statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "degree" = 4 AND "instance_types" @> ARRAY[ARRAY['ECNF']]::text[] ORDER BY "root_analytic_conductor" LIMIT 50
2022-01-21 19:22:18.691 UTC [1509034] webserver@lmfdb LOG:  duration: 8285.022 ms  statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "degree" = 4 AND "instance_types" @> ARRAY[ARRAY['ECNF']]::text[] ORDER BY "root_analytic_conductor" LIMIT 50
2022-01-21 20:53:09.030 UTC [1526089] webserver@lmfdb LOG:  duration: 18206.163 ms  statement: SELECT "algebraic", "analytic_conductor", "bad_primes", "central_character", "conductor", "degree", "instance_urls", "label", "motivic_weight", "mu_real", "mu_imag", "nu_real_doubled", "nu_imag", "order_of_vanishing", "primitive", "rational", "root_analytic_conductor", "root_angle", "self_dual", "z1" FROM "lfunc_search" WHERE "instance_types" @> ARRAY[ARRAY['ECQ']]::text[] ORDER BY "root_analytic_conductor" LIMIT 1000
rbommel commented 2 years ago

Pull request #4959 should improve this. But I did notice that indices seem to be missing for some of these columns. To really get improvement, these indices should be added to lfunc_search.

edgarcosta commented 2 years ago

indexes added