get a more unified dataset api

andregraubner / ClimateNet

Climate Analytics using Deep Neural Networks in Python.

https://www.nersc.gov/research-and-development/data-analytics/big-data-center/climatenet/

MIT License


get a more unified dataset api #4

Closed andregraubner closed 3 years ago

andregraubner commented 3 years ago

Currently, labeled and unlabelled datasets have different implementations, because the unlabelled ALLHIST dataset is structured differently from the expert labels. We should probably unify the two (and save the expert labels online in the unified format).

andregraubner commented 3 years ago

Proposed structure: every sample is a NetCDF (.nc) file with one group, "Data", and, if the sample is labeled, a second group, "Labels". Inside the "Data" group, the data has the named dimensions "time", "variable", "lat", "lon".

This seems general enough to be useful while still simplifying data loading. It also makes the code more readable and maintainable, because dimensions no longer have to be juggled around.
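
To make the proposed layout concrete, here is a minimal sketch of a check that a single loader could run on any sample, labeled or not. It is only an illustration, not code from the repository: the function name `check_sample_layout` and the plain mapping used to describe a file's groups are hypothetical, and it assumes that dimensions are matched by name rather than by order.

```python
from typing import Mapping, Sequence

# Named dimensions of the "Data" group, as listed in the proposal.
DATA_DIMS = ("time", "variable", "lat", "lon")


def check_sample_layout(groups: Mapping[str, Sequence[str]]) -> bool:
    """Validate one sample's layout and report whether it is labeled.

    `groups` is a hypothetical description of a sample file: it maps each
    group name to the dimension names found in that group.

    Returns True if a "Labels" group is present, False otherwise.
    Raises ValueError if the "Data" group is missing or lacks a dimension.
    """
    if "Data" not in groups:
        raise ValueError('sample is missing the required "Data" group')

    # Assumption: dimensions are looked up by name, so order is not checked.
    missing = [dim for dim in DATA_DIMS if dim not in groups["Data"]]
    if missing:
        raise ValueError(f'"Data" group is missing dimensions: {missing}')

    # Labeled and unlabelled samples differ only in this optional group.
    return "Labels" in groups


if __name__ == "__main__":
    unlabelled = {"Data": ("time", "variable", "lat", "lon")}
    print(check_sample_layout(unlabelled))
```

With a check like this, labeled and unlabelled samples go through the same code path, and the only branch is whether "Labels" exists.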