openclimatefix / satflow

Satellite Optical Flow with machine learning models
https://satflow.readthedocs.io/en/stable/
MIT License
Iterable Satflow dataset #16

Closed jacobbieker closed 3 years ago

jacobbieker commented 3 years ago

Should use a lot less RAM, since loading ~144 sets of images at once grows to more than 30 GB of RAM. This is probably pushing me further towards Zarr, but this is one last try with webdataset. This one has a lot more limitations than the other SatFlowDataset, but trades them for lower memory use. There is no random access, and we have to iterate through the entire day before knowing whether we can use the samples for training, so there is possibly a lot of wasted IO. A middle option would be to split the data into hour-long chunks or something similar for training optical flow, but that still makes it hard to train for 4 hours ahead.
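To illustrate the memory trade-off, here is a minimal sketch of the general idea of iterating over a day of frames in order while holding only a small window in memory. It is not the code in this PR and does not use webdataset; `sliding_samples`, `fake_day`, and the window sizes are hypothetical names and values chosen for the example, and integers stand in for image sets.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")


def sliding_samples(
    frames: Iterable[Frame], n_inputs: int, n_targets: int
) -> Iterator[Tuple[List[Frame], List[Frame]]]:
    """Yield (inputs, targets) pairs from a sequential stream of frames.

    Only n_inputs + n_targets frames are held in memory at any time,
    and frames are consumed strictly in order (no random access).
    """
    window_len = n_inputs + n_targets
    window: deque = deque(maxlen=window_len)  # oldest frame drops out automatically
    for frame in frames:
        window.append(frame)
        if len(window) == window_len:
            items = list(window)
            yield items[:n_inputs], items[n_inputs:]


def fake_day(n_frames: int = 144) -> Iterator[int]:
    """Hypothetical stand-in for one day of image sets, read one at a time."""
    for i in range(n_frames):
        yield i


# Assumed sizes for illustration only: 6 input frames, 24 target frames.
samples = list(sliding_samples(fake_day(144), n_inputs=6, n_targets=24))
print(len(samples))
```

A stream shorter than the full window yields no samples at all, which is the same reason hour-long chunks would not support long prediction horizons.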