cedadev / cmip6-object-store

CMIP6 Object Store Library
BSD 3-Clause "New" or "Revised" License

Create a simple inventory of all datasets to be included #5

Closed agstephens closed 4 years ago

agstephens commented 4 years ago

Create a single inventory file listing the datasets that we should replicate to the object store.

Output should be a text file with one dataset ID per line, e.g.:

CMIP6.ScenarioMIP.MPI-M.MPI-ESM1-2-LR.ssp126.r1i1p1f1.Amon.zg.gn.v20190710
CMIP6.ScenarioMIP.MPI-M.MPI-ESM1-2-LR.ssp126.r1i1p1f1.CF3hr.psl.gn.v20190815
...
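As a rough illustration of the ID format above, the sketch below splits a dataset ID into its ten dot-separated components. The facet names are an assumption taken from the usual CMIP6 DRS ordering and are not stated in this thread; `parse_dataset_id` is a hypothetical helper, not part of this repository.

```python
# Facet names are assumed from the CMIP6 DRS ordering; the thread itself
# only shows the dot-separated IDs.
FACETS = (
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
)


def parse_dataset_id(dataset_id):
    """Split a dataset ID into a dict keyed by facet name (hypothetical helper)."""
    parts = dataset_id.strip().split(".")
    if len(parts) != len(FACETS):
        raise ValueError(
            f"expected {len(FACETS)} components, got {len(parts)}: {dataset_id!r}"
        )
    return dict(zip(FACETS, parts))


facets = parse_dataset_id(
    "CMIP6.ScenarioMIP.MPI-M.MPI-ESM1-2-LR.ssp126.r1i1p1f1.Amon.zg.gn.v20190710"
)
print(facets["source_id"], facets["variable_id"], facets["version"])
```

A check like this could be used to reject malformed lines before an inventory file is accepted.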

Derive the list from the IPCC WG1 priority list. Initially, about 20 TB will be a great start.

RuthPetrie commented 4 years ago

cmip6-datasets.txt

OK, I have a list. It isn't quite as requested, Ag (sorry): each line has a dataset ID, the number of files and the total size of the dataset. I thought this information would be useful, and the file can trivially be converted into the format you requested. The total here is ~33 TB, so a bit more than requested, but definitely enough to start with. Let me know if you want any more work done on this list.
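A minimal sketch of that conversion, assuming each line is comma-separated as `dataset_id,number_of_files,total_size`. The delimiter, the column order and the size unit are assumptions (the attached file's actual layout is not shown in this thread), and `to_id_list` is a hypothetical helper.

```python
import csv


def to_id_list(lines):
    """Return (dataset IDs, total file count, total size) from inventory lines.

    Assumes comma-separated columns: dataset_id, number_of_files, total_size.
    The size is summed in whatever unit the input uses.
    """
    ids = []
    total_files = 0
    total_size = 0.0
    for record in csv.reader(lines):
        # Skip blank lines.
        if not record or not record[0].strip():
            continue
        ids.append(record[0].strip())
        total_files += int(record[1])
        total_size += float(record[2])
    return ids, total_files, total_size


# The counts and sizes below are made up purely for illustration.
sample = [
    "CMIP6.ScenarioMIP.MPI-M.MPI-ESM1-2-LR.ssp126.r1i1p1f1.Amon.zg.gn.v20190710,5,1.5",
    "CMIP6.ScenarioMIP.MPI-M.MPI-ESM1-2-LR.ssp126.r1i1p1f1.CF3hr.psl.gn.v20190815,20,2.5",
]
ids, n_files, size = to_id_list(sample)
print("\n".join(ids))  # one dataset ID per line, as originally requested
```

Keeping the file count and size alongside the IDs makes it easy to check the total against the storage target before replication starts.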

agstephens commented 4 years ago

Hi @RuthPetrie, please can you update the list and create a bigger one? Many thanks.

RuthPetrie commented 4 years ago

@agstephens An updated list of datasets (188.8 TB) is available at https://github.com/cedadev/cmip6-object-store/blob/master/data/cmip6-datasets_2020-10-27.csv.gz

agstephens commented 4 years ago

Thanks Ruth, that's great!