pangeo-data / pangeo-cmip6-cloud

Documentation for Pangeo CMIP6 data stored in GCP/AWS cloud
https://pangeo-data.github.io/pangeo-cmip6-cloud/

Handle retractions for CMIP6 netcdf bucket in AWS #33

Open aradhakrishnanGFDL opened 2 years ago

aradhakrishnanGFDL commented 2 years ago

This issue is similar to the one opened here, but for the netCDF holdings in the cloud.

We need a CI pipeline that handles retractions for the esgf-world holdings in the corresponding intake-esm catalog.

jbusecke commented 2 years ago

I think there is a bunch of logic that can be reused here. Happy to refactor as needed.

aradhakrishnanGFDL commented 2 years ago

> I think there is a bunch of logic that can be reused here. Happy to refactor as needed.

Thanks, Julius. Starting from your original Actions workflow, I was able to set up this workflow in my repo, which removes retracted datasets from the esgf-world (CMIP6 netCDF) intake-esm catalog. After further testing, I plan to move it to pangeo-cmip6-cloud.
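
For illustration only (this is not the workflow linked above), here is a minimal Python sketch of the catalog-pruning step such a pipeline might run. It assumes the catalog is a CSV with one row per dataset and that a list of retracted dataset IDs has already been fetched. The column names in `ID_COLUMNS`, the dotted ID format, and the helper names `instance_id`, `drop_retracted`, and `to_csv` are all hypothetical.

```python
import csv
import io

# Assumed (hypothetical) catalog columns used to build a dataset identifier.
ID_COLUMNS = [
    "source_id", "experiment_id", "member_id",
    "table_id", "variable_id", "version",
]


def instance_id(row):
    """Join the identifying columns of one catalog row into a dotted ID."""
    return ".".join(row[col] for col in ID_COLUMNS)


def drop_retracted(catalog_csv, retracted_ids):
    """Split catalog rows (given as CSV text) into (kept, removed) lists."""
    retracted = set(retracted_ids)  # set membership keeps the lookup cheap
    kept, removed = [], []
    for row in csv.DictReader(io.StringIO(catalog_csv)):
        (removed if instance_id(row) in retracted else kept).append(row)
    return kept, removed


def to_csv(rows, fieldnames):
    """Serialize rows back to CSV text with the given column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up rows and paths, for demonstration only.
    catalog = (
        "source_id,experiment_id,member_id,table_id,variable_id,version,path\n"
        "ModelA,historical,r1i1p1f1,Amon,tas,v20190101,s3://example-bucket/a.nc\n"
        "ModelA,historical,r1i1p1f1,Amon,pr,v20190101,s3://example-bucket/b.nc\n"
    )
    retracted = ["ModelA.historical.r1i1p1f1.Amon.pr.v20190101"]
    kept, removed = drop_retracted(catalog, retracted)
    print(to_csv(kept, ID_COLUMNS + ["path"]))
```

Returning the removed rows alongside the kept ones lets a CI job log or review what was dropped before the pruned catalog is published.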

jbusecke commented 2 years ago

Awesome! I still need to do the S3 Zarr catalog, but this should help with that. @rabernat and I have been chatting briefly about moving these catalogs to a more robust structure (like BigQuery or Redshift). We could then produce a CSV from that central source. Happy to chat more about it in today's CMIP meeting.