effjot closed this issue 2 years ago.
Ok for me.
@mdouchin What do you think about the cost of this view? It should be alright.
If the query would be too expensive, could including these fields be set as a configuration option?
Maybe yes.
The locator runs the query for every character typed in the search text, so that's a lot of queries. Building the flat table for each request might not be ideal if you have a few simultaneous users and more "dataset" rows (with contacts and links).
Are you using it in production? Can you monitor the execution time of this query on your instance? How many dataset features do you have in your table?
CC @mdouchin @drillinP
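For reference, both questions can be answered from psql; a sketch, assuming the table is pgmetadata.dataset:

```sql
-- Number of dataset rows:
SELECT count(*) FROM pgmetadata.dataset;

-- Execution time: enable psql timing with \timing on,
-- or prefix the locator query with EXPLAIN (ANALYZE, BUFFERS)
-- to see the plan and the actual run time.
```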
The function export_datasets_as_flat_table uses the view v_export_table, which is not very heavy IMHO. Since the dataset table will be of reasonable size for the average database, we can try it and see how it reacts IRL.

Please answer @Gustry's questions above about the size of the dataset table, and could you run an EXPLAIN ANALYSE on the query?
Thanks for your input! I didn’t quite realise that a search is done for every character typed.
I don’t have a production environment yet, but am preparing our switch to QGIS + PostgreSQL (we’re a team of about 15 gis users). At the moment, our test database isn’t very big. I can try to fill the DB with more data and do several parallel searches to get a feeling for how it reacts.
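To grow the test database quickly, synthetic rows could be generated directly in SQL. A rough sketch only: the column list below is an assumption, and any NOT NULL columns of the real pgmetadata.dataset table without defaults would have to be added:

```sql
-- Hypothetical bulk insert to grow pgmetadata.dataset for load testing.
-- Adjust the column list to the actual table definition.
INSERT INTO pgmetadata.dataset (schema_name, table_name, title, abstract)
SELECT
    'test_schema',
    'test_table_' || i,
    'Test dataset ' || i,
    'Synthetic abstract for load testing, row ' || i
FROM generate_series(1, 10000) AS i;
```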
At the moment, I’m testing with two DBs (around 20 and around 5 datasets; I’m at home now, so I can’t tell the exact number). Tomorrow I’ll run the EXPLAIN ANALYSE.
> Thanks for your input! I didn’t quite realise that a search is done for every character typed.
Yes, the locator sends the SQL query to the database to fetch results.
Since QGIS 3.18 it’s possible to add a time delay in the locator: https://qgis.org/api/classQgsLocatorFilter.html#a5f17ae8096cab7a6a4e134d9edf46074
I now have 50 rows in the dataset table. This is the explain analyse output:
geodaten=> explain analyse
  SELECT concat(d.title, ' (', d.table_name, '.', d.schema_name, ')') AS displayString,
         d.schema_name, d.table_name, d.geometry_type, title
  FROM pgmetadata.dataset d
  INNER JOIN pgmetadata.v_valid_dataset v
    ON concat(v.table_name, '.', v.schema_name) = concat(d.table_name, '.', d.schema_name)
  WHERE concat(d.title, ' ', d.abstract, ' ', d.table_name, ' ', d.categories, ' ', d.keywords, ' ', d.themes) ILIKE '%Nitrat%'
  LIMIT 100;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=64.03..78.87 rows=1 width=125) (actual time=0.737..2.885 rows=5 loops=1)
-> Nested Loop (cost=64.03..78.87 rows=1 width=125) (actual time=0.737..2.883 rows=5 loops=1)
Join Filter: (concat(d.table_name, '.', d.schema_name) = concat(d_1.table_name, '.', d_1.schema_name))
Rows Removed by Join Filter: 245
-> Seq Scan on dataset d (cost=0.00..14.78 rows=1 width=93) (actual time=0.134..2.102 rows=5 loops=1)
Filter: (concat(title, ' ', abstract, ' ', table_name, ' ', categories, ' ', keywords, ' ', themes) ~~* '%Nitrat%'::text)
Rows Removed by Filter: 45
-> Sort (cost=64.03..64.03 rows=2 width=37) (actual time=0.118..0.120 rows=50 loops=5)
Sort Key: d_1.schema_name, d_1.table_name
Sort Method: quicksort Memory: 29kB
-> Hash Join (cost=16.39..64.02 rows=2 width=37) (actual time=0.167..0.451 rows=50 loops=1)
Hash Cond: (((n.nspname)::text = (d_1.schema_name)::text) AND ((c.relname)::text = (d_1.table_name)::text))
-> Hash Join (cost=1.09..48.41 rows=56 width=128) (actual time=0.032..0.317 rows=181 loops=1)
Hash Cond: (c.relnamespace = n.oid)
-> Seq Scan on pg_class c (cost=0.00..46.69 rows=181 width=72) (actual time=0.011..0.258 rows=181 loops=1)
Filter: ((relname IS NOT NULL) AND (relkind = ANY ('{r,p}'::"char"[])))
Rows Removed by Filter: 834
-> Hash (cost=1.04..1.04 rows=4 width=68) (actual time=0.013..0.013 rows=14 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on pg_namespace n (cost=0.00..1.04 rows=4 width=68) (actual time=0.005..0.008 rows=14 loops=1)
-> Hash (cost=14.52..14.52 rows=52 width=37) (actual time=0.055..0.056 rows=50 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on dataset d_1 (cost=0.00..14.52 rows=52 width=37) (actual time=0.005..0.025 rows=50 loops=1)
Planning Time: 0.535 ms
Execution Time: 2.969 ms
(25 rows)
(the planning times are in the range 0.5 – 0.6 ms, execution times 2.8 – 3.1 ms)
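At 50 rows a sequential scan like the one in the plan is perfectly fine. If the table ever grows enough for the ILIKE '%…%' filter to hurt, one option (a sketch, not tested here) is a trigram index via the pg_trgm extension. Note that concat() is not IMMUTABLE and so cannot be used in an index expression; the index below uses || with COALESCE instead, and the locator query would have to be rewritten to use the same expression for the planner to pick it up:

```sql
-- Sketch only: requires the pg_trgm extension; column choice is an
-- assumption (array-typed columns such as categories would need an
-- immutable text conversion before they could be included).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX dataset_search_trgm_idx
ON pgmetadata.dataset
USING gin (
    ((coalesce(title, '') || ' ' || coalesce(abstract, '') || ' '
      || coalesce(table_name, ''))) gin_trgm_ops
);
```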
In my test project I connect to 2 databases with metadata, but the second DB’s dataset table has only a few rows. Searching with the locator bar is fast, I experienced no delay. At the moment, the server isn’t under heavy load though.
I’m working with QGIS 3.16, but will switch to 3.22 (next LTR) in October.
Thanks, let's see how it goes.
I have extended the locator search to keywords, categories, and themes (labels expanded according to locale). Maybe you’d like to add this to the main branch.