Open peterdudfield opened 2 years ago
Test results
This was done on different gsp locations With no datetim interval [s] 15(gsp=35) 10 (gsp 99) 15 (gsp 189) 12 (gsp 250) 6 (gsp 259) 18 (gsp 121) 18 (gsp 200) 12 (gsp 150) 2 (gsp 2) 2 (gsp 4) 1 (gsp 6) 2 (gsp 8)
mean = 7
With no datetime interval [s] 10 (gsp=67) 2 (gsp 72) 20 (gsp 267) 15 (gsp 135) 6 (gsp 259) 10 (gsp 301) 4 (gsp 316) 8 (gsp 100) 19 (gsp 10) 3 (gsp 1) 2 (gsp 3) 2 (gsp 5) 2 (gsp 7)
mean = 8
I tried to make an index, but it took > 10 mins
CREATE INDEX forecast_value_2022_10_datetime_interval_id on forecast_value_2022_10(datetime_interval_id)
EXPLAIN ANALYZE Select DISTINCT ON (forecast_value.target_time) forecast_value.created_utc AS forecast_value_created_utc, forecast_value.uuid AS forecast_value_uuid, forecast_value.target_time AS forecast_value_target_time, forecast_value.expected_power_generation_megawatts AS forecast_value_expected_power_generation_megawatts, forecast_value.forecast_id AS forecast_value_forecast_id
FROM forecast_value
JOIN forecast ON forecast.id = forecast_value.forecast_id
JOIN location ON location.id = forecast.location_id
JOIN datetime_interval ON forecast_value.datetime_interval_id = datetime_interval.id
AND location.gsp_id = 211
-- and datetime_interval.start_datetime_utc >= '2022-10-25'
-- and datetime_interval.end_datetime_utc <= '2022-10-26'
and forecast_value.target_time >= '2022-10-25'
and forecast_value.target_time <= '2022-10-26'
AND forecast_value.created_utc >= '2022-10-25'
AND forecast_value.target_time - forecast_value.created_utc >= interval '240 minute'
AND forecast_value.created_utc < '2022-10-26 13:01'
AND forecast.created_utc >= '2022-10-25'
ORDER BY forecast_value.target_time, forecast_value.created_utc DESC;
Could also add index on creation_utc - CREATE INDEX forecast_value_2022_10_created_utc on forecast_value_2022_10(created_utc DESC)
It took 20 mins, but maybe it could have been quicker
Other option
create INDEX idx_forecast_created_utc on forecast(created_utc)
(THIS SEEMS TO HELP) Also did index on forecast_value - created_utc. They both probably helpWould be good to bench makr against production to check the changes are doing the desired effect
Above 3. should mean that there is atmost 7 days of data to read. This should make the fast to load. Lets review in 1 weeks time
Detailed Description
Currently takes ~ 6 seconds for each GSP to read forecast value table when looking at a X hour forecast.
Query at the momment is
We have already partitation this table, but there are about 36 million rows per month. This means by the end of the month, its a bit slow. ~ 6 seconds per query. Note if you query again its quicker (this is how SQL works)
Context
Possible Implementation
ALTER TABLE forecast_value ADD datetime_interval_id int;
2 . update and add datetime_intervals 3 . then run updated query, to see if it speeds up. If it does, can then add it in