Closed emimaesmith closed 4 years ago
Adding timings to the different parts of the get_status
method reveals the majority of the time is spent grouping the results by time.
A few optimizations can be made on the db level for ingests.
For /ingests, some massive improvements:
For /ingests/status:
/ingests/status isn't too much better but there isn't much else to improve. There is a bit more computation going on at the Python level than with other requests, but you can see the overhead time (difference in db queries vs full HTTP request) isn't that much more. The single query that is holding most of this up:
SELECT ••• FROM "ingest" LEFT OUTER JOIN "strike" ON ("ingest"."strike_id" = "strike"."id") WHERE "ingest"."status" = 'INGESTED' ORDER BY "ingest"."data_started" ASC
I'm guessing it's just due to the size of the database and that inner join.
Description The REST API for /ingests/status takes too long to return when the database is very large.
Reproduction Steps Steps to reproduce the problem:
Expected behavior The /ingest/status should return results - not a 504.
Related to #1831