openclimatefix / ocf_datapipes

OCF's DataPipe based dataloader for training and inference
MIT License

Add Caching to Disk in Datapipe #182

Closed jacobbieker closed 10 months ago

jacobbieker commented 1 year ago

Pull Request

Description

This adds an option to cache the outputs of some datapipes to disk, with examples in the training pipeline showing how to use it. This should help with compute-intensive data pipelines.
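As a rough illustration of the idea (not the code in this PR), here is a minimal standard-library sketch of caching an iterable pipeline's outputs to disk. The names `cached_iter`, `source_factory`, `cache_dir`, and the `_COMPLETE` marker file are hypothetical and are not part of ocf_datapipes or torchdata. It assumes the items can be pickled and that the upstream pipeline yields them in a stable order.

```python
import pickle
from pathlib import Path
from typing import Any, Callable, Iterable, Iterator


def cached_iter(
    source_factory: Callable[[], Iterable[Any]],
    cache_dir: Path,
) -> Iterator[Any]:
    """Yield items from the on-disk cache if it is complete, else compute and store them.

    Hypothetical helper for illustration only; not the ocf_datapipes API.
    """
    cache_dir = Path(cache_dir)
    done_marker = cache_dir / "_COMPLETE"

    if done_marker.exists():
        # Cache hit: replay the pickled items in order and skip the upstream work.
        for path in sorted(cache_dir.glob("*.pkl")):
            with path.open("rb") as f:
                yield pickle.load(f)
        return

    # Cache miss: run the expensive pipeline once and write each item out.
    cache_dir.mkdir(parents=True, exist_ok=True)
    for index, item in enumerate(source_factory()):
        # Zero-padded names keep lexicographic order equal to yield order.
        with (cache_dir / f"{index:08d}.pkl").open("wb") as f:
            pickle.dump(item, f)
        yield item

    # Mark the cache as usable only after a full pass has finished.
    done_marker.touch()
```

Passing a factory rather than an already-built iterable means that a cache hit never constructs the expensive upstream pipeline. An interrupted first pass leaves no marker, so the next run recomputes and overwrites the partial files.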

Fixes #

How Has This Been Tested?

Unit tests and running it.

Checklist:

codecov[bot] commented 1 year ago

Codecov Report

Merging #182 (57ba8db) into main (86fa3de) will decrease coverage by 0.06%. The diff coverage is 8.33%.

:exclamation: Current head 57ba8db differs from pull request most recent head 4a21574. Consider uploading reports for the commit 4a21574 to get more accurate results

@@            Coverage Diff             @@
##             main     #182      +/-   ##
==========================================
- Coverage   77.79%   77.73%   -0.06%     
==========================================
  Files         124      124              
  Lines        5097     5106       +9     
==========================================
+ Hits         3965     3969       +4     
- Misses       1132     1137       +5     
Impacted Files Coverage Δ
ocf_datapipes/training/metnet_pv_site.py 22.98% <8.33%> (-1.38%) :arrow_down:

... and 1 file with indirect coverage changes


jacobbieker commented 10 months ago

This has already been partly done in #238.