spaceml-org / helionb-sdoml

Heliophysics notebooks corresponding to the SDO ML Dataset
GNU General Public License v3.0

Pytorch dataloader #7

Open mariusgiger opened 2 years ago

mariusgiger commented 2 years ago

Hi all,

Thanks for the great work on providing more suitable primitives for Machine Learning in Heliophysics!

It would be great to have a PyTorch DataLoader for the SDO ML v2 dataset to simplify data loading and make experiments easier to reproduce.

Ideally the DataLoader should:

A potential starting point can be found here: https://github.com/i4Ds/awesome-helio/pull/10/files (still a few TODOs) - feedback is welcome.
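To illustrate the general shape of the request, here is a minimal sketch of a map-style dataset that a PyTorch `DataLoader` could wrap. It is not taken from either linked implementation. The class name `ToySDODataset`, the channel keys `"171A"` and `"193A"`, and the in-memory dictionary standing in for the real SDO ML v2 storage are all hypothetical, chosen only so the example runs on its own.

```python
from typing import Dict, List, Sequence

# One 2D image as nested lists of floats (stand-in for a real array type).
Image = List[List[float]]


class ToySDODataset:
    """Map-style dataset: it only needs __len__ and __getitem__.

    `store` is a hypothetical in-memory mapping of channel name to a
    time-ordered list of images; the real dataset is stored differently.
    """

    def __init__(self, store: Dict[str, Sequence[Image]], channels: Sequence[str]):
        missing = [c for c in channels if c not in store]
        if missing:
            raise KeyError(f"channels not in store: {missing}")
        lengths = {len(store[c]) for c in channels}
        if len(lengths) != 1:
            raise ValueError("all channels must have the same number of time steps")
        self.store = store
        self.channels = list(channels)
        self._length = lengths.pop()

    def __len__(self) -> int:
        return self._length

    def __getitem__(self, idx: int) -> List[Image]:
        if not 0 <= idx < self._length:
            raise IndexError(idx)
        # Stack the requested channels for one time step:
        # nested lists shaped (channels, height, width).
        return [self.store[c][idx] for c in self.channels]


# Toy data: 4 time steps of 2x2 images per (hypothetical) channel key.
store = {
    "171A": [[[float(t)] * 2 for _ in range(2)] for t in range(4)],
    "193A": [[[float(t + 10)] * 2 for _ in range(2)] for t in range(4)],
}
dataset = ToySDODataset(store, channels=["171A", "193A"])

# Optional: batch with PyTorch if it is installed.
try:
    import torch
    from torch.utils.data import DataLoader
except ImportError:
    torch = None

if torch is not None:
    loader = DataLoader(
        dataset,
        batch_size=2,
        shuffle=False,
        # Convert each batch of nested lists into a single tensor.
        collate_fn=lambda batch: torch.tensor(batch),
    )
    for batch in loader:
        print(tuple(batch.shape))
```

Keeping channel selection and indexing in the dataset class, and leaving batching and shuffling to the `DataLoader`, is one common way to split the responsibilities; a real implementation would also have to deal with storage access, missing observations, and normalization.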

Cheers, Marius

PaulJWright commented 2 years ago

That would be great! I will take a look at the one there when I get a moment; please do feel free to do a PR to incorporate your DataLoader when it's finished!

mariusgiger commented 2 years ago

A more advanced version can be found here: https://github.com/i4Ds/sdo-cli/blob/main/src/sdo/sood/data/sdo_ml_v2_dataset.py. There are still a few open issues, but it implements some of the features mentioned above.

PaulJWright commented 2 years ago

Thanks Marius. This looks great!

I am toying with developing a DataLoader myself, but would it be OK if we linked to https://github.com/i4Ds/sdo-cli/blob/main/src/sdo/sood/data/sdo_ml_v2_dataset.py from the SDOML GitHub?

mariusgiger commented 2 years ago

Hi @PaulJWright, sure, you can link to it. Not everything I have put there will be needed by others, but it can serve as inspiration.

Let me know if you need some help.

Cheers!

PaulJWright commented 2 years ago

Great, will do @mariusgiger! I like yours a lot, so I think I will keep the one I'm developing for the most basic use cases (the notebooks here, for example), and direct people over to yours for more complicated things!