pangeo-data / pangeo-datastore

Pangeo Cloud Datastore
https://catalog.pangeo.io

Cannot load variable through s3fs but can browse on AWS S3 Explorer #131

Open diptiSH opened 2 years ago

diptiSH commented 2 years ago

Hello,

I am a new user of the AWS datastore. I want to get the 'sftlf' variable from AWS, and I am using the following code:

    import pandas as pd
    import s3fs

    # Connect to AWS S3 storage
    fs = s3fs.S3FileSystem(anon=True)
    # Load the catalog CSV and filter it down to the dataset of interest
    df = pd.read_csv("https://cmip6-pds.s3.amazonaws.com/pangeo-cmip6.csv")
    qstring = "activity_id=='CMIP' & institution_id =='NCC' & source_id=='NorESM2-LM' & table_id=='fx' & experiment_id=='historical' & member_id=='r1i1p1f1' & variable_id=='sftlf'"
    df.query(qstring)

This returns an empty DataFrame.
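When a multi-facet query like the one above comes back empty, it can help to see which facet removes the last matching rows. The following is only a sketch: `rows_after_each_filter` is a hypothetical helper, and the small `toy` table is made-up data standing in for the real catalog CSV so the example runs without network access.

```python
import pandas as pd


def rows_after_each_filter(df, facets):
    """Apply the facet filters one at a time and record how many rows survive each step."""
    counts = []
    subset = df
    for column, value in facets.items():
        subset = subset[subset[column] == value]
        counts.append((column, len(subset)))
    return counts


# Made-up stand-in for the catalog. The real table would come from
# pd.read_csv("https://cmip6-pds.s3.amazonaws.com/pangeo-cmip6.csv").
toy = pd.DataFrame(
    {
        "source_id": ["NorESM2-LM", "NorESM2-LM", "CESM2"],
        "table_id": ["fx", "Amon", "fx"],
        "variable_id": ["areacella", "tas", "sftlf"],
    }
)

facets = {"source_id": "NorESM2-LM", "table_id": "fx", "variable_id": "sftlf"}
counts = rows_after_each_filter(toy, facets)
print(counts)
```

The first facet whose count drops to zero is the one the catalog has no entry for, given the facets applied before it.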

I cannot find it through s3fs, but I can browse it in the AWS S3 Explorer: https://cmip6-pds.s3.amazonaws.com/index.html#CMIP6/CMIP/NCC/NorESM2-LM/historical/r1i1p1f1/fx/sftlf/gn/v20190815/sftlf/
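One way to check the same location from Python is to turn the explorer link into a `bucket/key` path and list it with s3fs. This is a sketch under an assumption: the explorer link has the form `https://<bucket>.s3.amazonaws.com/index.html#<key>`, as in the URL above. `explorer_url_to_s3_path` and `list_prefix` are hypothetical helper names.

```python
from urllib.parse import urlparse


def explorer_url_to_s3_path(url):
    """Convert an S3 Explorer link into a 'bucket/key' path.

    Assumes the bucket is the first label of the host name and the
    object key is carried in the URL fragment (the part after '#').
    """
    parts = urlparse(url)
    bucket = parts.netloc.split(".s3.")[0]
    return f"{bucket}/{parts.fragment}"


def list_prefix(path):
    """List the objects under a prefix anonymously (needs network access)."""
    import s3fs  # imported lazily so the URL helper works without s3fs installed

    fs = s3fs.S3FileSystem(anon=True)
    return fs.ls(path)


url = (
    "https://cmip6-pds.s3.amazonaws.com/index.html#CMIP6/CMIP/NCC/NorESM2-LM/"
    "historical/r1i1p1f1/fx/sftlf/gn/v20190815/sftlf/"
)
path = explorer_url_to_s3_path(url)
print(path)
```

Calling `list_prefix(path)` then shows whether the objects are readable with anonymous access, independently of what the catalog CSV lists.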

rabernat commented 2 years ago

Thanks for your question.

This probably means that the data have been retracted. They may still be in S3 but are not exposed by the catalog. cc @jbusecke.

jbusecke commented 2 years ago

AFAIK there have been no retractions on AWS so far (at least I haven't done any).

But @rabernat's advice is generally right. @diptiSH we usually do not delete stores, just unlink them from the catalog. This is done so that retracted or otherwise 'wonky' datasets can still be used by users who specifically want them. May I ask whether a specific page recommended the AWS explorer route? If so, we might want to add a warning there.
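For users who do want a store that is not linked from the catalog, a direct open might look like the sketch below. It rests on assumptions that are not confirmed in this thread: that the Zarr store root is the version directory (the parent of the trailing `sftlf/` array directory in the explorer link), and that the store has consolidated metadata. `zarr_store_root` and `open_unlisted_store` are hypothetical helper names.

```python
import posixpath


def zarr_store_root(array_path):
    """Return the parent directory of an array directory, with a trailing slash.

    Assumption: the array directory sits directly under the store root.
    """
    return posixpath.dirname(array_path.rstrip("/")) + "/"


def open_unlisted_store(store_path):
    """Open a Zarr store on S3 anonymously with xarray (needs network access)."""
    import s3fs  # imported lazily so the path helper works without these packages
    import xarray as xr

    fs = s3fs.S3FileSystem(anon=True)
    # consolidated=True is an assumption; drop it if the store has no
    # consolidated metadata.
    return xr.open_zarr(fs.get_mapper(store_path), consolidated=True)


array_path = (
    "s3://cmip6-pds/CMIP6/CMIP/NCC/NorESM2-LM/historical/r1i1p1f1/"
    "fx/sftlf/gn/v20190815/sftlf/"
)
store_path = zarr_store_root(array_path)
print(store_path)
```

`open_unlisted_store(store_path)` would then return the dataset lazily. A store opened this way bypasses the catalog, so it may be retracted or otherwise flawed and should be checked before use.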