openclimatefix / nowcasting_dataset

Prepare batches of data for training machine learning solar electricity nowcasting data
https://nowcasting-dataset.readthedocs.io/en/stable/
MIT License

Document the contents & shapes of the NetCDF files that `nowcasting_dataset` outputs? #227

Open JackKelly opened 2 years ago

JackKelly commented 2 years ago

The Pydantic models in the data_sources/<modality>/<modality>_model.py files (where <modality> is one of {datetime, metadata, gsp, nwp, pv, satellite, sun, topographic}) describe the contents of the batches (and, hence, the contents of the NetCDF files). (For example, see satellite_model.py).

But it might be nice to aggregate all that documentation together into a single human-readable page somewhere? So it's really quick and easy for users to get an overview (and to see the detail) of precisely what nowcasting_dataset outputs to disk?

I guess the ideal situation would be to automatically extract the description fields from the pydantic models? I wonder if sphinx can do that?!
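As a rough illustration of what "automatically extract the description fields" could look like, here is a minimal sketch that reads the descriptions from a Pydantic model's JSON schema. `SatelliteExample`, its fields, its description strings, and the helper `field_descriptions` are all hypothetical stand-ins, not the real contents of satellite_model.py. The helper tries the Pydantic v2 method first and falls back to the v1 method.

```python
from typing import Dict, List

from pydantic import BaseModel, Field


class SatelliteExample(BaseModel):
    """Hypothetical stand-in for a model in satellite_model.py."""

    # The description strings below are illustrative, not the project's real ones.
    data: List[float] = Field(..., description="Example description of the image data")
    x: List[float] = Field(..., description="Example description of the x coordinates")


def field_descriptions(model: type) -> Dict[str, str]:
    """Map each field name to the description stored in the model's JSON schema."""
    # Pydantic v2 exposes model_json_schema(); Pydantic v1 exposes schema().
    if hasattr(model, "model_json_schema"):
        schema = model.model_json_schema()
    else:
        schema = model.schema()
    return {
        name: props.get("description", "")
        for name, props in schema["properties"].items()
    }


descriptions = field_descriptions(SatelliteExample)
for name, text in descriptions.items():
    print(f"{name}: {text}")
```

Running a helper like this over every modality's model would give the raw material for a single overview page, whether or not a Sphinx extension is used to render it.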

Related to #222

JackKelly commented 2 years ago

> I guess the ideal situation would be to automatically extract the description fields from the pydantic models? I wonder if sphinx can do that?!

Yup, it can! https://sphinx-pydantic.readthedocs.io/en/latest/

peterdudfield commented 2 years ago

The 'data_sources' are now basically 'xr.Datasets', with a few pydantic features.

Perhaps a good start is just to write a description in each of the data_sources/<modality>/<modality>_model.py files.
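If the descriptions end up as class docstrings in each model file, they could still be gathered into one page with the standard library alone. The sketch below assumes that approach. The classes `Satellite` and `PV`, their docstrings, and the function `render_overview` are hypothetical placeholders, not the project's actual code.

```python
import inspect


class Satellite:
    """Hypothetical description: satellite imagery for each example in a batch."""


class PV:
    """Hypothetical description: PV power output for each example in a batch."""


def render_overview(models) -> str:
    """Build one plain-text page from the docstrings of the given classes."""
    sections = []
    for model in models:
        # __doc__ is None when a class has no docstring of its own.
        doc = inspect.cleandoc(model.__doc__) if model.__doc__ else "(no description yet)"
        underline = "-" * len(model.__name__)
        sections.append(f"{model.__name__}\n{underline}\n{doc}")
    return "\n\n".join(sections)


overview = render_overview([Satellite, PV])
print(overview)
```

The same loop would work for any list of classes, so each modality's model could be added as its description is written.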