aodn / content

Tracks AODN Portal content and configuration issues
0 stars 0 forks source link

Faimms downloads timing out #348

Closed jonescc closed 6 years ago

jonescc commented 6 years ago

To reproduce

Select faimms and download as csv

What happens

Times out starting download

What I expect to happen

Starts downloading.

jonescc commented 6 years ago

Another instance of this yesterday for a filtered download

From the geoserver logs last night, the following download timed out for an end user:

http://geoserver-portal.aodn.org.au/geoserver/ows?typeName=faimms_timeseries_data&SERVICE=WFS&outputFormat=csv-with-metadata-header&REQUEST=GetFeature&VERSION=1.0.0&CQL_FILTER=INTERSECTS(geom%2CPOLYGON((143.52676391601%20-26.869415283205%2C143.52676391601%20-12.015899658205%2C155.39199829101%20-12.015899658205%2C155.39199829101%20-26.869415283205%2C143.52676391601%20-26.869415283205)))&downloadToken=1533714674345

Running the equivalent query in postgres shows the following query plan - noting that it took over 22 minutes to return any records:

"Hash Join  (cost=25527383.14..27957089.80 rows=21114595 width=273) (actual time=1379617.175..2197878.507 rows=93391133 loops=1)"
"  Hash Cond: (measurements_noqaqc.file_id = table_a.id)"
"  ->  Merge Right Join  (cost=25524545.68..26622237.76 rows=21114595 width=90) (actual time=1379564.964..1859505.118 rows=93391133 loops=1)"
"        Merge Cond: ((measurements_qaqc."VALUES" = measurements_noqaqc."VALUES") AND (measurements_qaqc."TIME" = measurements_noqaqc."TIME") AND (measurements_qaqc.channel_id = measurements_noqaqc.channel_id))"
"        ->  Sort  (cost=17904121.91..18113836.93 rows=83886008 width=18) (actual time=267126.703..379564.282 rows=83616936 loops=1)"
"              Sort Key: measurements_qaqc."VALUES", measurements_qaqc."TIME", measurements_qaqc.channel_id"
"              Sort Method: external merge  Disk: 2796176kB"
"              ->  Seq Scan on measurements_qaqc  (cost=0.00..1702955.08 rows=83886008 width=18) (actual time=1.235..55059.048 rows=84219077 loops=1)"
"        ->  Materialize  (cost=7620423.77..7725996.74 rows=21114595 width=88) (actual time=1112438.250..1320906.533 rows=93391133 loops=1)"
"              ->  Sort  (cost=7620423.77..7673210.25 rows=21114595 width=88) (actual time=1112438.244..1238604.652 rows=93363282 loops=1)"
"                    Sort Key: measurements_noqaqc."VALUES", measurements_noqaqc."TIME", measurements_noqaqc.channel_id"
"                    Sort Method: external merge  Disk: 9956016kB"
"                    ->  Hash Join  (cost=93.40..2020541.38 rows=21114595 width=88) (actual time=0.783..179170.895 rows=93363282 loops=1)"
"                          Hash Cond: (measurements_noqaqc.channel_id = table_b.channel_id)"
"                          ->  Seq Scan on measurements_noqaqc  (cost=0.00..1571762.84 rows=63343784 width=22) (actual time=0.014..72954.321 rows=93363282 loops=1)"
"                          ->  Hash  (cost=92.11..92.11 rows=103 width=70) (actual time=0.745..0.745 rows=309 loops=1)"
"                                Buckets: 1024  Batches: 1  Memory Usage: 39kB"
"                                ->  Seq Scan on timeseries_feature_deployments_varname table_b  (cost=0.00..92.11 rows=103 width=70) (actual time=0.062..0.539 rows=309 loops=1)"
"                                      Filter: ((geom && '0103000020E610000001000000050000003AFFFF3FDBF061401002000092DE3AC03AFFFF3FDBF0614020040000240828C03AFFFF3F8B6C634020040000240828C03AFFFF3F8B6C63401002000092DE3AC03AFFFF3FDBF061401002000092DE3AC0'::ge (...)"
"  ->  Hash  (cost=1584.98..1584.98 rows=35798 width=155) (actual time=51.888..51.888 rows=35682 loops=1)"
"        Buckets: 32768  Batches: 2  Memory Usage: 3592kB"
"        ->  Seq Scan on indexed_file table_a  (cost=0.00..1584.98 rows=35798 width=155) (actual time=0.025..24.744 rows=35682 loops=1)"
"Planning time: 1.086 ms"
"Execution time: 2238341.376 ms"

Looks to me like postgres is really, really struggling to join faimms.measurements_qaqc and faimms.measurements_noqaqc for a large number of selected records in real time. Perhaps a materialised view or some other mechanism for generating this combined result set and maintaining it may give better results and reduce the load on the system?

lbesnard commented 6 years ago

https://github.com/aodn/harvesters/pull/629