NCAR / esm-collection-spec

Earth System Model Collection specification

Apache License 2.0

Structure of the repo #1

Closed by andersy005 5 years ago

andersy005 commented 5 years ago

How should we structure the repository?

Cc @matt-long, @rabernat

matt-long commented 5 years ago

It's not clear to me that it's feasible to store CSV files in the repo, as some will be too big. Should we consider using intake to reference accessible locations of CSV files? Does this overlap with aletheia-data?

https://github.com/NCAR/aletheia-data

andersy005 commented 5 years ago

> Does this overlap with aletheia-data?

Yes, there's some overlap with what we are planning to do with aletheia-data. Since pandas can read remote CSV files, a user in the Pangeo use case can just pass in https://storage.googleapis.com/cmip6/cmip6-zarr-consolidated-stores.csv or any other remote URL, and everything will work. For use cases that generate big CSV files, like the CMIP6/CESM data holdings on Glade, we would need to store the CSV files on the FTP server.
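As a rough illustration of the point about remote CSV files, here is a minimal sketch. The helper name `load_collection_csv` is hypothetical and not part of any existing tool; it relies only on `pandas.read_csv`, which accepts a local path or a URL.

```python
import pandas as pd


def load_collection_csv(source: str) -> pd.DataFrame:
    """Load a collection CSV into a DataFrame.

    `source` may be a local path or a remote URL; pandas.read_csv
    handles both. This helper is a hypothetical wrapper used only
    for illustration.
    """
    return pd.read_csv(source)
```

With network access, passing the Pangeo URL above to `load_collection_csv` should return the same kind of DataFrame as a local file would, so the storage location only changes the string a user passes in.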

matt-long commented 5 years ago

So it seems like this repo should be a catalog of collections, but the collections themselves are stored elsewhere.

rabernat commented 5 years ago

We should model this repo on https://github.com/radiantearth/stac-spec

The actual catalogs themselves do not live here. What lives here is the specification itself (documentation), plus tools to validate catalogs and examples.
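To make the idea of a validation tool concrete, here is a minimal sketch of what checking a collection descriptor could look like. The field names in `REQUIRED_FIELDS` and the function `validate_collection` are assumptions for illustration only; nothing in this thread defines the actual schema.

```python
from typing import Any, Dict, List

# Hypothetical required fields and their types; the real specification
# may name or structure these differently.
REQUIRED_FIELDS = {"id": str, "description": str, "catalog_file": str}


def validate_collection(descriptor: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no problems were found."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in descriptor:
            problems.append(f"missing field: {name}")
        elif not isinstance(descriptor[name], expected_type):
            problems.append(f"field {name} should be {expected_type.__name__}")
    return problems


# The descriptor lives in a repo or anywhere else; the CSV it points to
# is stored elsewhere, as discussed above.
example = {
    "id": "pangeo-cmip6",
    "description": "Example collection pointing at a remote CSV",
    "catalog_file": "https://storage.googleapis.com/cmip6/cmip6-zarr-consolidated-stores.csv",
}
problems = validate_collection(example)
```

Returning a list of problems rather than raising on the first one lets a validator report everything wrong with a descriptor in a single pass.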

andersy005 commented 5 years ago

Closing this for the time being as it's been addressed in previous PRs.