aodn / content

Tracks AODN Portal content and configuration issues
0 stars 0 forks source link

ANMN NRS CTD - SCHEMA anmn_nrs_ctd_profiles missing INDEX on "TIME" #486

Closed lbesnard closed 2 years ago

lbesnard commented 2 years ago

Running queries on the anmn_nrs_ctd_profiles using the "TIME" column is really time consuming. This would create some issues for users on the portal. I ran a query which took 12 hours.

This shows the definition of the WFS view anmn_nrs_ctd_profiles_data. We can see that this depends on the measurements table:

select *  from pg_catalog.pg_views where  schemaname like '%anmn_nrs_ctd_profiles%' and viewname = 'anmn_nrs_ctd_profiles_data'
+-----------------------+----------------------------+-----------------------+----------------------------------------------------------------------------------------+
| schemaname            | viewname                   | viewowner             | definition                                                                             |
|-----------------------+----------------------------+-----------------------+----------------------------------------------------------------------------------------|
| anmn_nrs_ctd_profiles | anmn_nrs_ctd_profiles_data | anmn_nrs_ctd_profiles |  SELECT measurements.file_id,                                                          |
|                       |                            |                       |     a.site_code,                                                                       |
|                       |                            |                       |     a.cruise_id,                                                                       |
|                       |                            |                       |     timezone('UTC'::text, a.time_coverage_start) AS time_coverage_start,               |
|                       |                            |                       |     timezone('UTC'::text, a.time_coverage_end) AS time_coverage_end,                   |
|                       |                            |                       |     timezone('UTC'::text, measurements."TIME") AS "TIME",                              |
|                       |                            |                       |     measurements."INSTANCE",                                                           |
|                       |                            |                       |     measurements."DIRECTION",                                                          |
|                       |                            |                       |     measurements."TIME_quality_control",                                               |
|                       |                            |                       |     measurements."LATITUDE",                                                           |
|                       |                            |                       |     measurements."LATITUDE_quality_control",                                           |
|                       |                            |                       |     measurements."LONGITUDE",                                                          |
|                       |                            |                       |     measurements."LONGITUDE_quality_control",                                          |
|                       |                            |                       |     measurements."DEPTH",                                                              |
|                       |                            |                       |     measurements."DEPTH_quality_control",                                              |
|                       |                            |                       |     measurements."BOT_DEPTH",                                                          |
|                       |                            |                       |     measurements."BOT_DEPTH_quality_control",                                          |
|                       |                            |                       |     measurements."PRES_REL",                                                           |
|                       |                            |                       |     measurements."PRES_REL_quality_control",                                           |
|                       |                            |                       |     measurements."TEMP",                                                               |
|                       |                            |                       |     measurements."TEMP_quality_control",                                               |
|                       |                            |                       |     measurements."PSAL",                                                               |
|                       |                            |                       |     measurements."PSAL_quality_control",                                               |
|                       |                            |                       |     measurements."DOX2",                                                               |
|                       |                            |                       |     measurements."DOX2_quality_control",                                               |
|                       |                            |                       |     measurements."TURB",                                                               |
|                       |                            |                       |     measurements."TURB_quality_control",                                               |
|                       |                            |                       |     measurements."CHLF",                                                               |
|                       |                            |                       |     measurements."CHLF_quality_control",                                               |
|                       |                            |                       |     measurements."CHLU",                                                               |
|                       |                            |                       |     measurements."CHLU_quality_control",                                               |
|                       |                            |                       |     measurements."CPHL",                                                               |
|                       |                            |                       |     measurements."CPHL_quality_control",                                               |
|                       |                            |                       |     measurements."CNDC",                                                               |
|                       |                            |                       |     measurements."CNDC_quality_control",                                               |
|                       |                            |                       |     measurements."DESC",                                                               |
|                       |                            |                       |     measurements."DESC_quality_control",                                               |
|                       |                            |                       |     measurements."DENS",                                                               |
|                       |                            |                       |     measurements."DENS_quality_control",                                               |
|                       |                            |                       |     a.geom                                                                             |
|                       |                            |                       |    FROM (anmn_nrs_ctd_profiles.measurements                                            |
|                       |                            |                       |      JOIN anmn_nrs_ctd_profiles.deployments a ON ((measurements.file_id = a.file_id))) |
|                       |                            |                       |   ORDER BY measurements.file_id, measurements."TIME", measurements."DEPTH";            |
+-----------------------+----------------------------+-----------------------+----------------------------------------------------------------------------------------+

However, doing a listing of indexes on this table shows that there is no index on "TIME",

select *  from pg_indexes where schemaname like '%anmn_nrs_ctd_profiles%' and tablename = 'measurements'
+-----------------------+--------------+----------------------------+--------------+----------------------------------------------------------------------------------------------------------------------+
| schemaname            | tablename    | indexname                  | tablespace   | indexdef                                                                                                             |
|-----------------------+--------------+----------------------------+--------------+----------------------------------------------------------------------------------------------------------------------|
| anmn_nrs_ctd_profiles | measurements | measurements_time_dpth_idx | <null>       | CREATE INDEX measurements_time_dpth_idx ON anmn_nrs_ctd_profiles.measurements USING btree (file_id, "TIME", "DEPTH") |
| anmn_nrs_ctd_profiles | measurements | measurements_file_id_idx   | <null>       | CREATE INDEX measurements_file_id_idx ON anmn_nrs_ctd_profiles.measurements USING btree (file_id)                    |
| anmn_nrs_ctd_profiles | measurements | measurements_pkey          | <null>       | CREATE UNIQUE INDEX measurements_pkey ON anmn_nrs_ctd_profiles.measurements USING btree (measurement)                |
+-----------------------+--------------+----------------------------+--------------+----------------------------------------------------------------------------------------------------------------------+

This should be fixed in the ANMN_NRS_CTD_PROFILES harvester

lbesnard commented 2 years ago

There probably should be an index for portal users such as

CREATE INDEX measurement_utc_time_idx ON measurement USING btree (timezone('UTC'::text, "TIME")) 

however, the queries I'm running have to be done on "TIME" and not timezone(**) so the creation of the above index probably wouldn't affect me.

I ran in a similar issue with the anmn_ts schema and the measurement table which has the following indexes:

select * from pg_indexes where schemaname = 'anmn_ts' and tablename = 'measurement'
+--------------+-------------+--------------------------+--------------+----------------------------------------------------------------------------------------------------------+
| schemaname   | tablename   | indexname                | tablespace   | indexdef                                                                                                 |
|--------------+-------------+--------------------------+--------------+----------------------------------------------------------------------------------------------------------|
| anmn_ts      | measurement | measurement_utc_time_idx | <null>       | CREATE INDEX measurement_utc_time_idx ON anmn_ts.measurement USING btree (timezone('UTC'::text, "TIME")) |
| anmn_ts      | measurement | measurement_fk           | <null>       | CREATE UNIQUE INDEX measurement_fk ON anmn_ts.measurement USING btree (ts_id, index)                     |
+--------------+-------------+--------------------------+--------------+----------------------------------------------------------------------------------------------------------+

The query I'm currently running has been going on for 20 hours

ggalibert commented 2 years ago

@lbesnard , would be good to look on the portal if this is really worth doing and is relevant with filtering use cases.

lbesnard commented 2 years ago

doesn't seem to affect WFS data download