Closed andersy005 closed 5 years ago
It's not clear to me that it's feasible to store CSV files in this repo, as some will be too big. Should we consider using intake to reference accessible locations of CSV files? Does this overlap with aletheia-data?
Does this overlap with aletheia-data?
Yes, there's some overlap with what we are planning to do with aletheia-data. Since pandas can read remote CSV files, for the Pangeo use case a user can just pass in https://storage.googleapis.com/cmip6/cmip6-zarr-consolidated-stores.csv or any other remote URL and everything will work. For use cases that generate big CSV files, like the CMIP6/CESM data holdings on Glade, we would need to store the CSV files on the FTP server.
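For illustration, here is a minimal sketch of the Pangeo case, assuming pandas is installed and the URL above is still publicly readable. `load_collection` is a hypothetical helper name, not something from an existing package.

```python
import pandas as pd

# URL quoted in this thread; its continued availability is an assumption.
PANGEO_CMIP6_CSV = (
    "https://storage.googleapis.com/cmip6/cmip6-zarr-consolidated-stores.csv"
)


def load_collection(location: str) -> pd.DataFrame:
    """Read a collection CSV from a local path or a remote URL.

    Hypothetical helper: pandas.read_csv accepts both local paths and
    http(s) URLs, so no separate download step is needed.
    """
    return pd.read_csv(location)


# Example call (needs network access, so it is left commented out):
# df = load_collection(PANGEO_CMIP6_CSV)
# print(df.head())
```

The same call works for a CSV on local disk, which is why only the location would need to be recorded for each collection.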
So it seems like this repo should be a catalog of collections, but the collections themselves are stored elsewhere.
We should model this repo on https://github.com/radiantearth/stac-spec.
The actual catalogs themselves do not live here. What lives here is the specification itself (documentation), plus tools to validate catalogs and examples.
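A rough sketch of what "a catalog of collections stored elsewhere, plus a validation tool" could look like. Everything here is an assumption for illustration: the `CATALOG` layout, the `validate_catalog` name, and the allowed URL schemes are hypothetical, and this is not how stac-spec validates catalogs.

```python
from urllib.parse import urlparse

# Hypothetical catalog: each entry only points at where a collection CSV lives.
CATALOG = {
    "pangeo-cmip6": (
        "https://storage.googleapis.com/cmip6/cmip6-zarr-consolidated-stores.csv"
    ),
}

# Assumed set of acceptable remote schemes (FTP included for the Glade case).
ALLOWED_SCHEMES = {"http", "https", "ftp"}


def validate_catalog(catalog):
    """Return a list of error messages; an empty list means no problems found."""
    errors = []
    for name, location in catalog.items():
        parsed = urlparse(location)
        if parsed.scheme not in ALLOWED_SCHEMES:
            errors.append(f"{name}: unsupported scheme {parsed.scheme!r}")
        elif not parsed.netloc:
            errors.append(f"{name}: missing host")
        elif not parsed.path.endswith(".csv"):
            errors.append(f"{name}: location does not end in .csv")
    return errors
```

This check only looks at the shape of each location string; it does not fetch the file or inspect its columns.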
Closing this for the time being as it's been addressed in previous PRs.
How should we structure the repository?
Cc @matt-long, @rabernat