openclimatefix / ocf_datapipes

OCF's DataPipe based dataloader for training and inference
MIT License

Add NWP contiguous function to allow stale NWP in training #239

Closed dfulu closed 9 months ago

dfulu commented 9 months ago

Pull Request

Description

The current .get_contiguous_time_periods() function seems to be built primarily for linear time series and is not ideal for NWP-like forecast products. The new .get_contiguous_time_periods_nwp() function lets us filter NWP init times by the maximum staleness we are willing to accept for an init time.

Some example timelines comparing .get_contiguous_time_periods() and .get_contiguous_time_periods_nwp() are shown below. The title of each panel shows the settings used - e.g. the top-left panel used a history duration of 0, a forecast duration of 6 hours and a maximum staleness of 9 hours.

In the figure, the blue dots indicate the available NWP forecast init times. The blue lines show how long each init time can be used before it becomes "stale". The red lines show how long it takes before each init time has the required history length available. Many forecast init times are missing in this example dataset, as they frequently are in our UKV 2022 and 2023 data.

The black line shows the contiguous periods found by .get_contiguous_time_periods_nwp(). The green line shows the periods found by the older .get_contiguous_time_periods().

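[Figure: example timelines comparing the periods found by .get_contiguous_time_periods() and .get_contiguous_time_periods_nwp()]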
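As a rough illustration of the idea (not the code in this PR), the sketch below assumes that an init time `t` can serve any sample time between `t + history_duration` and `t + max_staleness`, and merges those windows into contiguous periods. The function name `contiguous_periods_with_staleness`, the column names `start_dt` and `end_dt`, and the example init times are all made up for this example; the real .get_contiguous_time_periods_nwp() may treat edge cases differently.

```python
import pandas as pd


def contiguous_periods_with_staleness(
    init_times: pd.DatetimeIndex,
    history_duration: pd.Timedelta,
    max_staleness: pd.Timedelta,
) -> pd.DataFrame:
    """Illustrative only: merge per-init-time validity windows into periods.

    Assumption: init time t is usable for sample times in
    [t + history_duration, t + max_staleness].
    """
    if max_staleness < history_duration:
        raise ValueError("max_staleness must be >= history_duration")
    if len(init_times) == 0:
        return pd.DataFrame(columns=["start_dt", "end_dt"])

    init_times = init_times.sort_values()

    periods = []
    start = init_times[0] + history_duration
    end = init_times[0] + max_staleness

    for t in init_times[1:]:
        window_start = t + history_duration
        # The previous init time goes stale before this one is usable,
        # so the current period ends and a new one begins.
        if window_start > end:
            periods.append((start, end))
            start = window_start
        # Init times are sorted, so this never moves the end backwards.
        end = t + max_staleness

    periods.append((start, end))
    return pd.DataFrame(periods, columns=["start_dt", "end_dt"])


# Hypothetical 3-hourly init times with the 09:00, 12:00 and 15:00 runs missing.
init_times = pd.DatetimeIndex(
    [
        "2023-01-01 00:00",
        "2023-01-01 03:00",
        "2023-01-01 06:00",
        "2023-01-01 18:00",
        "2023-01-01 21:00",
    ]
)
periods = contiguous_periods_with_staleness(
    init_times,
    history_duration=pd.Timedelta(hours=0),
    max_staleness=pd.Timedelta(hours=6),
)
print(periods)
```

With these made-up init times and a 6-hour maximum staleness, the missing runs split the result into two periods: 00:00 to 12:00 on 1 January, and 18:00 on 1 January to 03:00 on 2 January.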

Checklist:

codecov[bot] commented 9 months ago

Codecov Report

Attention: 1 line in your changes is missing coverage. Please review.

Comparison is base (dd64df6) 78.44% compared to head (2d67722) 78.60%. Report is 5 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| ocf_datapipes/training/common.py | 96.42% | 1 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #239      +/-   ##
==========================================
+ Coverage   78.44%   78.60%   +0.16%
==========================================
  Files         130      129       -1
  Lines        5668     5693      +25
==========================================
+ Hits         4446     4475      +29
+ Misses       1222     1218       -4
```
