Open savannahferretti opened 9 months ago
Thanks so much for the request. I am working on it in #102.
@jbusecke Any updates on this?
Hi @savannahferretti, sorry for the delay here. I just got back from vacation. I did observe some errors when I ran this a while back, but I think these did not affect all datasets during testing. Is having a subset of your request online valuable for you at all?
I am working on #102 now. I hope that we can get some iids to complete. There are definitely some of the requested ones that seem to be 'hard to get' from the ESGF side, so those might require more patience.
I just merged #102, and expect some of those datasets to become available soon. You can run this code on the hub to check on them being available:
def zstore_to_iid(zstore: str):
    # this is a bit whacky to account for the different way of storing old/new stores
    return '.'.join(zstore.replace('gs://','').replace('.zarr','').replace('.','/').split('/')[-11:-1])
iids_requested = [
'CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r4i1p1f1.3hr.pr.gn.v20210607',
'CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r4i1p1f1.6hrLev.hus.gn.v20210921',
'CMIP6.CMIP.CSIRO.ACCESS-ESM1-5.historical.r1i1p1f1.3hr.pr.gn.v20191115',
'CMIP6.CMIP.CSIRO.ACCESS-ESM1-5.historical.r1i1p1f1.6hrLev.ta.gn.v20191115',
'CMIP6.CMIP.CSIRO.ACCESS-ESM1-5.historical.r1i1p1f1.6hrLev.hus.gn.v20191115',
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.3hr.pr.gn.v20200212',
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.6hrLev.ta.gn.v20200212',
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.6hrLev.hus.gn.v20200212',
'CMIP6.CMIP.NCAR.CESM2.historical.r11i1p1f1.E1hr.pr.gn.v20190514',
'CMIP6.CMIP.NCAR.CESM2.historical.r11i1p1f1.6hrLev.ta.gn.v20200210',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.3hr.pr.gn.v20200616',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.6hrLev.ta.gn.v20200616',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.6hrLev.hus.gn.v20200616',
'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.3hr.pr.gn.v20210114',
'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.6hrLev.ta.gn.v20210114',
'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.6hrLev.hus.gn.v20210114',
'CMIP6.CMIP.CCCma.CanESM5.historical.r1i1p2f1.3hr.pr.gn.v20190429',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.3hr.pr.gn.v20190826',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.6hrLev.ta.gn.v20190826',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.6hrLev.hus.gn.v20190826',
'CMIP6.CMIP.MOHC.HadGEM3-GC31-LL.historical.r1i1p1f3.3hr.pr.gn.v20201103',
'CMIP6.CMIP.MOHC.HadGEM3-GC31-MM.historical.r1i1p1f3.3hr.pr.gn.v20200720',
'CMIP6.CMIP.MOHC.HadGEM3-GC31-MM.historical.r1i1p1f3.6hrLev.ta.gn.v20201019',
'CMIP6.CMIP.MOHC.HadGEM3-GC31-MM.historical.r1i1p1f3.6hrLev.hus.gn.v20201019',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.3hr.pr.gn.v20191226',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.6hrLev.ta.gn.v20191226',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.6hrLev.hus.gn.v20191226',
'CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.3hr.pr.gn.v20191129',
'CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.6hrLev.ta.gn.v20191129',
'CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.6hrLev.hus.gn.v20191129',
'CMIP6.CMIP.MIROC.MIROC6.historical.r1i1p1f1.3hr.pr.gn.v20190912',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.3hr.pr.gn.v20190627',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.6hrLev.ta.gn.v20190627',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.6hrLev.hus.gn.v20190627',
'CMIP6.CMIP.MPI-M.MPI-ESM1-2-LR.historical.r1i1p1f1.3hr.pr.gn.v20190710',
'CMIP6.CMIP.MPI-M.MPI-ESM1-2-HR.historical.r1i1p1f1.3hr.pr.gn.v20190710',
'CMIP6.CMIP.NUIST.NESM3.historical.r1i1p1f1.3hr.pr.gn.v20190630',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.3hr.pr.gn.v20230616',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.6hrLev.ta.gn.v20191108',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.6hrLev.hus.gn.v20191108',
'CMIP6.CMIP.SNU.SAM0-UNICON.historical.r1i1p1f1.3hr.pr.gn.v20190323',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.3hr.pr.gn.v20201013',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.6hrLev.ta.gn.v20201112',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.6hrLev.hus.gn.v20201112',
'CMIP6.CMIP.MOHC.UKESM1-0-LL.historical.r1i1p1f2.3hr.pr.gn.v20200507',
]
import intake
# uncomment/comment lines to swap catalogs
url = "https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog.json"
col = intake.open_esm_datastore(url)
iids_all = [zstore_to_iid(z) for z in col.df['zstore'].tolist()]
iids_uploaded = [iid for iid in iids_all if iid in iids_requested]
iids_uploaded
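To make the path handling in `zstore_to_iid` easier to follow, here is a minimal sketch that runs the same logic on two store paths. Both paths are hypothetical examples (assumptions, not actual catalog entries): one in the old nested-directory layout and one in a dotted `.zarr` layout. Note that both end with a trailing slash; the `[-11:-1]` slice relies on that to drop the final empty element, so a path without a trailing separator would lose its version facet.

```python
def zstore_to_iid(zstore: str) -> str:
    """Same logic as the helper above, repeated so this snippet runs on its own."""
    parts = zstore.replace('gs://', '').replace('.zarr', '').replace('.', '/').split('/')
    # The last element is empty because of the trailing slash, so skip it and
    # keep the 10 facets that come before it.
    return '.'.join(parts[-11:-1])


# Hypothetical store paths, for illustration only.
old_style = 'gs://cmip6/CMIP6/CMIP/NCC/NorESM2-MM/historical/r1i1p1f1/3hr/pr/gn/v20230616/'
new_style = 'gs://some-bucket/stores/CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.3hr.pr.gn.v20230616.zarr/'

iid_old = zstore_to_iid(old_style)
iid_new = zstore_to_iid(new_style)
print(iid_old)
print(iid_new)
```

Both calls should yield the same 10-facet iid, which is what makes the `iid in iids_requested` comparison work across the two layouts.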
Note that I updated the code above. We just ingested the first two datasets successfully:
['CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.3hr.pr.gn.v20210114',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.3hr.pr.gn.v20230616']
If you could check those out and see if everything looks ok, that would be really valuable!
Making progress!
['CMIP6.CMIP.MOHC.HadGEM3-GC31-LL.historical.r1i1p1f3.3hr.pr.gn.v20201103', 'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.3hr.pr.gn.v20200616', 'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.3hr.pr.gn.v20210114', 'CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r4i1p1f1.3hr.pr.gn.v20210607', 'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.3hr.pr.gn.v20230616', 'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.3hr.pr.gn.v20200212']
Please check out my new instructions on how to check on the progress here!
@jbusecke I used the progress code to check which datasets are ingested, and it matched the 6 you gave above. Also did a test plot of the model precipitation, and all seems reasonable so far!
Also, in terms of prioritizing uploads, can you focus on the following datasets (excludes models with data on hybrid height levels):
iids_requested = [
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.3hr.pr.gn.v20200212',
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.6hrLev.ta.gn.v20200212',
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.6hrLev.hus.gn.v20200212',
'CMIP6.CMIP.NCAR.CESM2.historical.r11i1p1f1.E1hr.pr.gn.v20190514',
'CMIP6.CMIP.NCAR.CESM2.historical.r11i1p1f1.6hrLev.ta.gn.v20200210',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.3hr.pr.gn.v20200616',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.6hrLev.ta.gn.v20200616',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.6hrLev.hus.gn.v20200616',
'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.3hr.pr.gn.v20210114',
'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.6hrLev.ta.gn.v20210114',
'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.6hrLev.hus.gn.v20210114',
'CMIP6.CMIP.CCCma.CanESM5.historical.r1i1p2f1.3hr.pr.gn.v20190429',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.3hr.pr.gn.v20190826',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.6hrLev.ta.gn.v20190826',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.6hrLev.hus.gn.v20190826',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.3hr.pr.gn.v20191226',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.6hrLev.ta.gn.v20191226',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.6hrLev.hus.gn.v20191226',
'CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.3hr.pr.gn.v20191129',
'CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.6hrLev.ta.gn.v20191129',
'CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.6hrLev.hus.gn.v20191129',
'CMIP6.CMIP.MIROC.MIROC6.historical.r1i1p1f1.3hr.pr.gn.v20190912',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.3hr.pr.gn.v20190627',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.6hrLev.ta.gn.v20190627',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.6hrLev.hus.gn.v20190627',
'CMIP6.CMIP.MPI-M.MPI-ESM1-2-LR.historical.r1i1p1f1.3hr.pr.gn.v20190710',
'CMIP6.CMIP.MPI-M.MPI-ESM1-2-HR.historical.r1i1p1f1.3hr.pr.gn.v20190710',
'CMIP6.CMIP.NUIST.NESM3.historical.r1i1p1f1.3hr.pr.gn.v20190630',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.3hr.pr.gn.v20230616',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.6hrLev.ta.gn.v20191108',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.6hrLev.hus.gn.v20191108',
'CMIP6.CMIP.SNU.SAM0-UNICON.historical.r1i1p1f1.3hr.pr.gn.v20190323',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.3hr.pr.gn.v20201013',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.6hrLev.ta.gn.v20201112',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.6hrLev.hus.gn.v20201112',
]
Thank you so much for all your hard work!
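For reference, a narrowed list like the one above can be derived from the full request by filtering on the model facet. The sketch below is only an illustration: `parse_iid` and `drop_models` are hypothetical helpers, the facet names follow the usual CMIP6 ordering, and the set of excluded models is inferred by comparing the two lists in this thread rather than taken from any official source.

```python
# Facet order of a CMIP6 instance id (iid), one name per dot-separated field.
FACETS = (
    'mip_era', 'activity_id', 'institution_id', 'source_id', 'experiment_id',
    'member_id', 'table_id', 'variable_id', 'grid_label', 'version',
)


def parse_iid(iid: str) -> dict:
    """Split an iid into a facet-name -> value mapping (hypothetical helper)."""
    values = iid.split('.')
    if len(values) != len(FACETS):
        raise ValueError(f'expected {len(FACETS)} facets, got {len(values)}: {iid}')
    return dict(zip(FACETS, values))


def drop_models(iids, excluded):
    """Keep only the iids whose source_id is not in `excluded`."""
    return [iid for iid in iids if parse_iid(iid)['source_id'] not in excluded]


# Assumption: models dropped between the first and second list in this thread.
excluded_models = {
    'ACCESS-CM2', 'ACCESS-ESM1-5', 'HadGEM3-GC31-LL', 'HadGEM3-GC31-MM', 'UKESM1-0-LL',
}

sample = [
    'CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r4i1p1f1.3hr.pr.gn.v20210607',
    'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.3hr.pr.gn.v20230616',
    'CMIP6.CMIP.MOHC.UKESM1-0-LL.historical.r1i1p1f2.3hr.pr.gn.v20200507',
]
kept = drop_models(sample, excluded_models)
print(kept)
```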
Hi @savannahferretti, unfortunately I currently have no way to really prioritize the upload. If the ESGF API gives me the URLs back and they are accessible (both not always given 🤔) then they will be run.
FYI I think I just saw a couple of these get through...Some more will definitely trickle in in my experience.
@jbusecke Thanks for the update! Was unsure about your process, so I figured if all requests weren't already submitted that these would be preferred to come first. I see that it's now a waiting game with ESGF, and will keep tabs on what's uploaded as they come in!
Just checked again, and it seems like we got a few hits (still missing 29 of 49; I expect more to come soon, since I just fixed a major bug!)
Just wanted to point out https://github.com/leap-stc/cmip6-leap-feedstock/pull/163, which will hopefully lead to more of those missing datasets coming online.
Just did a check-in on the downloaded datasets. As of today (June 14), here is what's loaded:
Found in catalog='qc':
iids=['CMIP6.CMIP.MPI-M.MPI-ESM1-2-HR.historical.r1i1p1f1.3hr.pr.gn.v20190710',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.3hr.pr.gn.v20200616',
'CMIP6.CMIP.MPI-M.MPI-ESM1-2-LR.historical.r1i1p1f1.3hr.pr.gn.v20190710',
'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.3hr.pr.gn.v20210114',
'CMIP6.CMIP.SNU.SAM0-UNICON.historical.r1i1p1f1.3hr.pr.gn.v20190323',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.3hr.pr.gn.v20230616',
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.3hr.pr.gn.v20200212',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.3hr.pr.gn.v20190627',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.3hr.pr.gn.v20191226',
'CMIP6.CMIP.CCCma.CanESM5.historical.r1i1p2f1.3hr.pr.gn.v20190429']
Found in catalog='non-qc':
iids=['CMIP6.CMIP.NUIST.NESM3.historical.r1i1p1f1.3hr.pr.gn.v20190630',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.3hr.pr.gn.v20201013']
Found in catalog='retracted':
iids=['CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.3hr.pr.gn.v20191129',
'CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.6hrLev.ta.gn.v20191129',
'CMIP6.CMIP.MIROC.MIROC6.historical.r1i1p1f1.3hr.pr.gn.v20190912',
'CMIP6.CMIP.MIROC.MIROC-ES2L.historical.r1i1p1f2.6hrLev.hus.gn.v20191129']
Still missing 19 of 35:
missing_iids=['CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.6hrLev.hus.gn.v20210114',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.6hrLev.ta.gn.v20191226',
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.6hrLev.ta.gn.v20200212',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.6hrLev.hus.gn.v20200616',
'CMIP6.CMIP.CCCR-IITM.IITM-ESM.historical.r1i1p1f1.6hrLev.hus.gn.v20191226',
'CMIP6.CMIP.AWI.AWI-ESM-1-1-LR.historical.r1i1p1f1.6hrLev.hus.gn.v20200212',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.6hrLev.ta.gn.v20190627',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.6hrLev.hus.gn.v20190826',
'CMIP6.CMIP.CMCC.CMCC-ESM2.historical.r1i1p1f1.6hrLev.ta.gn.v20210114',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.3hr.pr.gn.v20190826',
'CMIP6.CMIP.NCAR.CESM2.historical.r11i1p1f1.6hrLev.ta.gn.v20200210',
'CMIP6.CMIP.NCAR.CESM2.historical.r11i1p1f1.E1hr.pr.gn.v20190514',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.6hrLev.ta.gn.v20201112',
'CMIP6.CMIP.HAMMOZ-Consortium.MPI-ESM-1-2-HAM.historical.r1i1p1f1.6hrLev.hus.gn.v20190627',
'CMIP6.CMIP.CMCC.CMCC-CM2-SR5.historical.r1i1p1f1.6hrLev.ta.gn.v20200616',
'CMIP6.CMIP.AS-RCEC.TaiESM1.historical.r1i1p1f1.6hrLev.hus.gn.v20201112',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.6hrLev.hus.gn.v20191108',
'CMIP6.CMIP.CAS.FGOALS-g3.historical.r1i1p1f1.6hrLev.ta.gn.v20190826',
'CMIP6.CMIP.NCC.NorESM2-MM.historical.r1i1p1f1.6hrLev.ta.gn.v20191108']
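A report of this shape can be reproduced with plain set operations once the iids of each catalog are in hand. The following is a rough sketch and not the actual progress script used in this thread: `report_progress` is a hypothetical helper, the catalog names mirror the report above, and the short iids are placeholders standing in for real instance ids.

```python
def report_progress(requested, catalogs):
    """Return (found, missing) for a request checked against several catalogs.

    requested: iterable of iid strings
    catalogs:  mapping of catalog name -> iterable of iid strings in that catalog
    """
    requested_set = set(requested)
    # Requested iids present in each catalog, sorted for stable output.
    found = {name: sorted(requested_set & set(iids)) for name, iids in catalogs.items()}
    # Anything that shows up in no catalog at all is still missing.
    seen = set().union(*found.values()) if found else set()
    missing = sorted(requested_set - seen)
    return found, missing


# Placeholder data, for illustration only.
requested = ['model-a.pr', 'model-a.ta', 'model-b.pr', 'model-c.pr']
catalogs = {
    'qc': ['model-a.pr', 'unrelated.pr'],
    'non-qc': ['model-b.pr'],
    'retracted': [],
}

found, missing = report_progress(requested, catalogs)
for name, iids in found.items():
    print(f'Found in catalog={name!r}:\niids={iids}')
print(f'Still missing {len(missing)} of {len(requested)}:\nmissing_iids={missing}')
```

In practice the per-catalog lists would come from applying `zstore_to_iid` to each catalog's `zstore` column, as in the snippet earlier in the thread.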
Again, thanks so much for your help on this @jbusecke!
List of requested iids
Description
6-hourly temperature and specific humidity, and either 1-hourly or (more commonly) 3-hour mean precipitation data for 21 CMIP6 models (~20 TB data). Some of the variables for some of these models are already included in the catalog; repeats are not given in the above list.
This data is needed for key figures for a paper I'm looking to publish soon.