JuliaDataCubes / YAXArrays.jl

Yet Another XArray-like Julia package
https://juliadatacubes.github.io/YAXArrays.jl/

Sampling the data cube for machine learning #154

Open gdkrmr opened 2 years ago

gdkrmr commented 2 years ago

I have written a simple block sampler in space. Because there is quite a lot of deep learning work going on, this seems like something that should be available somehow.

Where should something like this go? What is the best sampling strategy on an irregular grid? Is this problem generic enough that not everyone will end up writing their own sampler anyway?
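For concreteness, a spatial block sampler of the kind described above could look roughly like the sketch below. It is only an illustration on a plain in-memory array with assumed dimension order (lon, lat, time). The name `sample_blocks` and its arguments are hypothetical and are not part of YAXArrays.jl. It also assumes a regular grid, so it does not address the irregular-grid question.

```julia
using Random

# Hypothetical helper: draw `n` random spatial blocks of size `blocksize`
# from the first two dimensions of a 3-D array laid out as (lon, lat, time).
# Blocks are drawn independently and uniformly, so they may overlap.
function sample_blocks(rng::AbstractRNG, cube::AbstractArray{T,3},
                       blocksize::NTuple{2,Int}, n::Int) where {T}
    nx, ny, _ = size(cube)
    bx, by = blocksize
    (bx <= nx && by <= ny) || throw(ArgumentError("block is larger than the cube"))
    map(1:n) do _
        i = rand(rng, 1:(nx - bx + 1))   # lower corner along the first axis
        j = rand(rng, 1:(ny - by + 1))   # lower corner along the second axis
        view(cube, i:(i + bx - 1), j:(j + by - 1), :)  # keep the full time axis
    end
end

cube = rand(MersenneTwister(1), Float32, 36, 18, 10)  # toy stand-in for a data cube
blocks = sample_blocks(MersenneTwister(42), cube, (8, 8), 4)
```

Each element of `blocks` is a view of size `(8, 8, 10)`, so no data is copied until a block is actually materialized.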

meggart commented 2 years ago

I think that efficient sampling from chunked data is still an unsolved problem. How hard it is depends on the use case, but I have seen a lot of inefficient workflows in the past. My current attempt at a time series sampler is at https://github.com/meggart/DiskArrayShufflers.jl (sorry that it is still completely undocumented). It is written in a way that can be extended to spatial and spatiotemporal block samplers. If you are interested, we can discuss the details in a call.

To answer the question: I would currently not add a sampler to YAXArrays, but would rather keep experimenting in small packages that sit on top of YAXArrays.
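To illustrate why sampling from chunked data is awkward: a fully random index order touches a different chunk on almost every read, while a sequential order gives no shuffling at all. One common compromise is to shuffle the order of the chunks and then shuffle the indices inside each chunk, so that every chunk is read only once. The sketch below shows that idea on plain indices. It is not taken from DiskArrayShufflers.jl, and `chunked_shuffle` is a hypothetical name.

```julia
using Random

# Hypothetical helper: visit every index in 1:n in a randomized order that
# finishes one chunk before moving on to the next, so each chunk is read once.
function chunked_shuffle(rng::AbstractRNG, n::Int, chunksize::Int)
    chunks = [i:min(i + chunksize - 1, n) for i in 1:chunksize:n]
    order = Int[]
    for c in shuffle(rng, chunks)                  # random chunk order
        append!(order, shuffle(rng, collect(c)))   # random order inside the chunk
    end
    return order
end

order = chunked_shuffle(MersenneTwister(0), 10, 4)
```

The result is a permutation of `1:10`, but not a uniformly random one: samples from the same chunk always end up next to each other. Whether that loss of randomness is acceptable depends on the training setup.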