leap-stc / cmip6-leap-feedstock

[REQUEST]: HighResMIP sea-ice dataset #180

Open mvichi opened 3 weeks ago

mvichi commented 3 weeks ago

List of requested iids

'CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825',
 'CMIP6.HighResMIP.AWI.AWI-CM-1-1-LR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825',
 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921',
 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190221',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190221',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190401',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190401',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.siconc.gn.v20181212',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.sivol.gn.v20181212',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.siconc.gn.v20190314',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170915',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180221',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180221',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20181119',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20181119',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180730',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180730',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170921',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170921',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170928',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170928',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200810',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200812',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200812',
 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.siconc.gn.v20210416',
 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.sivol.gn.v20210416',
 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180701',
 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180701',

Description

Hi guys, thanks a lot for your effort and for continuously improving the system. We recently ran an analysis on the HighResMIP output to assess the performance of sea ice simulations in the northern and southern hemispheres. The work was done with the "download model" and was published in the two papers below (Selivanova et al., 2024a,b). An MSc student at the University of Cape Town is currently adapting the SITool (Lin et al., 2021) to work with Pangeo, and we would also like to add the assessment of HighResMIP. The student is currently testing the system with the low-res CMIP6 models, and it would be great to add HighResMIP. The thesis should be submitted in February 2025, but the analysis should ideally be completed before the end of 2024.

Thanks in advance, Marcello

Lin, X., Massonnet, F., Fichefet, T., Vancoppenolle, M., 2021. SITool (v1.0) – a new evaluation tool for large-scale sea ice simulations: application to CMIP6 OMIP. Geoscientific Model Development 14, 6331–6354. https://doi.org/10.5194/gmd-14-6331-2021

Selivanova, J., Iovino, D., Cocetta, F., 2024a. Past and future of the Arctic sea ice in High-Resolution Model Intercomparison Project (HighResMIP) climate models. The Cryosphere 18, 2739–2763. https://doi.org/10.5194/tc-18-2739-2024

Selivanova, J., Iovino, D., Vichi, M., 2024b. Limited Benefits of Increased Spatial Resolution for Sea Ice in HighResMIP Simulations. Geophysical Research Letters 51, e2023GL107969. https://doi.org/10.1029/2023GL107969

jbusecke commented 3 weeks ago

Hi @mvichi, thanks for using the cloud data. I just started https://github.com/leap-stc/cmip6-leap-feedstock/pull/181 as a test and will run the full thing as soon as the PR succeeds! I am very busy this week, but this should squeeze in between other tasks and is related to my work all week. So please feel free to ping me here or via email (julius@ldeo.columbia.edu) in the likely case that I forget to move on this. I am motivated to get as much data up as possible for your deadline.

jbusecke commented 3 weeks ago

Seems like we are getting only 6 datasets from the ESGF API right now. Is that useful to ingest already? Happy to rerun things a few times and hope for better availability!

EDIT: This was my bad. I did not allow all member_ids. Let's see how many we get now!

jbusecke commented 3 weeks ago

Ok this looks better:

'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r3i1p2f1.SImon.sivol.gn.v20190215',
'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.siconc.gn.v20190314',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i3p1f1.SImon.siconc.gn.v20190710',
'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190401',
'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r6i1p1f1.SImon.siconc.gn.v20181119',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r8i1p1f1.SImon.siconc.gn.v20190425',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i5p1f1.SImon.siconc.gn.v20190418',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915',
'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r2i1p1f2.SImon.sivol.gn.v20200615',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r3i1p1f1.SImon.siconc.gn.v20181119',
'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i3p1f1.SImon.siconc.gn.v20190710',
'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r3i1p1f2.SImon.sivol.gn.v20200615',
'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r3i1p1f1.SImon.siconc.gn.v20181119',
'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r3i1p1f2.SImon.siconc.gn.v20200224',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i2p1f1.SImon.siconc.gn.v20190710',
'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r2i1p2f1.SImon.siconc.gn.v20190812',
'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r2i1p2f1.SImon.sivol.gn.v20190812',
'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i2p1f1.SImon.sivol.gn.v20190710'

These seem to be available right now. Not all that you requested, but I'll try to run these now and we can rerun later.
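Several entries in the list above match a requested dataset in everything except member_id or version. A minimal sketch of checking that, assuming the ten dot-separated fields of an iid follow the usual CMIP6 facet order; `parse_iid` and `matches_ignoring` are hypothetical helper names and not part of the feedstock:

```python
# Assumed facet order of a CMIP6 instance id (iid).
FACETS = (
    "mip_era", "activity_id", "institution_id", "source_id", "experiment_id",
    "member_id", "table_id", "variable_id", "grid_label", "version",
)

def parse_iid(iid: str) -> dict:
    """Split an iid into a facet-name -> value mapping."""
    parts = iid.split(".")
    if len(parts) != len(FACETS):
        raise ValueError(f"expected {len(FACETS)} fields, got {len(parts)}: {iid}")
    return dict(zip(FACETS, parts))

def matches_ignoring(iid_a: str, iid_b: str, ignore=("member_id", "version")) -> bool:
    """True if both iids agree on every facet that is not listed in `ignore`."""
    a, b = parse_iid(iid_a), parse_iid(iid_b)
    return all(a[k] == b[k] for k in FACETS if k not in ignore)

requested = "CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170928"
available = "CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i3p1f1.SImon.siconc.gn.v20190710"

print(parse_iid(available)["member_id"])
print(matches_ignoring(requested, available))
```

Here the two iids describe the same model, experiment and variable, so the comparison is true even though the member and version differ.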

jbusecke commented 3 weeks ago

Seeing a few errors for unavailable files (hopefully these resolve over time), but also a bunch of successful jobs already. I'll check back in a bit and give you a report.

jbusecke commented 3 weeks ago

OK, I will need to change gears and work on something else for now, but let's continue here soon.

So I followed the instructions to check which datasets were uploaded here and got:

Found in catalog='qc': iids=['CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.sivol.gn.v20210416', 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917', 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190401', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180730', 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.siconc.gn.v20190314', 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.sivol.gn.v20181212', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180730', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170921', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170928', 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917', 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.siconc.gn.v20181212', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170921', 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917', 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190401', 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190221', 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190221', 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180701', 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180701']

Found in catalog='non-qc': iids=['CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20181119', 'CMIP6.HighResMIP.AWI.AWI-CM-1-1-LR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825', 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606', 'CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170915', 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20181119', 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180221', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180221', 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606']

Found in catalog='retracted': iids=[]

Still missing 11 of 40: 
missing_iids=['CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810', 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200810', 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314', 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921', 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.siconc.gn.v20210416', 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170928', 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200812', 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921', 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915', 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200812', 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917']

Seems like we got 10+ uploaded and tested! There are quite a few that fail our tests (the non-qc catalog). If you or the student have some time to look into what might be wrong with those datasets (follow the instructions here to access the non-qc datasets), that would be very helpful. Perhaps we can fix the issues. For the 11 that are still missing, I would recommend that we rerun the ingestion a couple of times and see if this is just due to flaky data nodes.
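To see whether the missing datasets cluster on particular models, the missing iids can be grouped by model. A minimal sketch, assuming source_id is the fourth and variable_id the eighth dot-separated field of an iid; `group_by_model` is a hypothetical helper, and the list is only a subset of the missing iids printed above:

```python
from collections import defaultdict

def group_by_model(iids):
    """Map each source_id to the sorted list of variables found for it."""
    grouped = defaultdict(list)
    for iid in iids:
        parts = iid.split(".")
        source_id, variable_id = parts[3], parts[7]
        grouped[source_id].append(variable_id)
    return {model: sorted(variables) for model, variables in sorted(grouped.items())}

# Subset of the missing iids reported above, for illustration.
missing_subset = [
    "CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810",
    "CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200810",
    "CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921",
    "CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915",
]

for model, variables in group_by_model(missing_subset).items():
    print(f"{model}: {variables}")
```

Models for which every variable is missing are the more likely candidates for an unavailable data node than those with a single variable missing.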

mvichi commented 3 weeks ago

Thank you, Julius, that was incredibly quick. I am travelling right now, and I'll be back at work next week. We'll report back on their status and quality asap. We very much appreciate your prompt reaction!

jbusecke commented 3 weeks ago

Running the pipeline once again just to see if we catch some more. Getting closer:

Still missing 4 of 40: 
missing_iids=['CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810', 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921', 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200812', 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921']

jbusecke commented 1 week ago

Running again to see if we can get the last holdouts to ingest. @mvichi did you get a chance to test the newly ingested data?

mvichi commented 4 days ago

We tested all the available data, and most of them work, thanks! Some of them fail the xmip preprocessing and some others crash for other reasons, but the data integrity seems good. Thank you very much again for adding the data so quickly. We will make it available as a cookbook once completed. I'll share it through discourse, so that you can decide

jbusecke commented 2 days ago

Awesome. If you could raise issues over at xMIP I can take a look at what is going on once some time frees up!
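One way to gather the details for such issues is to record which datasets fail preprocessing and with what error. A minimal sketch that is not tied to any particular library: `apply_and_collect_failures` is a hypothetical helper, and `fake_preprocess` is a stand-in for the real preprocessing function (with xMIP this would presumably be `combined_preprocessing` from `xmip.preprocessing`, applied to xarray datasets rather than the plain dicts used here):

```python
from typing import Any, Callable, Dict, Tuple

def apply_and_collect_failures(
    datasets: Dict[str, Any], preprocess: Callable[[Any], Any]
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Apply `preprocess` to each dataset; return (processed, failures by iid)."""
    processed, failures = {}, {}
    for iid, ds in datasets.items():
        try:
            processed[iid] = preprocess(ds)
        except Exception as err:  # keep going so every failure gets recorded
            failures[iid] = f"{type(err).__name__}: {err}"
    return processed, failures

def fake_preprocess(ds):
    # Stand-in only: fails when a (made-up) required key is absent.
    if "lat" not in ds:
        raise KeyError("lat")
    return ds

# Placeholder inputs keyed by made-up ids.
datasets = {"iid-a": {"lat": [0.0]}, "iid-b": {}}
processed, failures = apply_and_collect_failures(datasets, fake_preprocess)

for iid, message in failures.items():
    print(f"{iid} failed with {message}")
```

The resulting mapping of iid to error message can be pasted into an issue as is.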

jbusecke commented 2 days ago

I am also still seeing 2 missing datasets on my end:

import intake

def zstore_to_iid(zstore: str) -> str:
    # This is a bit wacky to account for the different ways old/new stores are stored.
    # Normalize the store path to a list of elements, treating '.' like '/'.
    parts = zstore.replace('gs://', '').replace('.zarr', '').replace('.', '/').split('/')
    # First try the ten elements that precede the final element.
    iid = '.'.join(parts[-11:-1])
    if not iid.startswith('CMIP6'):
        # Otherwise fall back to the last ten elements.
        iid = '.'.join(parts[-10:])
    return iid

def search_iids(col_url: str) -> list:
    # Relies on the module-level `iids_requested` list defined below.
    col = intake.open_esm_datastore(col_url)
    iids_all = [zstore_to_iid(z) for z in col.df['zstore'].tolist()]
    return [iid for iid in iids_all if iid in iids_requested]

iids_requested = [
'CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825',
 'CMIP6.HighResMIP.AWI.AWI-CM-1-1-LR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825',
 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200921',
 'CMIP6.HighResMIP.BCC.BCC-CSM2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200921',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-HR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200917',
 'CMIP6.HighResMIP.CMCC.CMCC-CM2-VHR4.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200917',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190221',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190221',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.siconc.gn.v20190401',
 'CMIP6.HighResMIP.CNRM-CERFACS.CNRM-CM6-1.hist-1950.r1i1p1f2.SImon.sivol.gn.v20190401',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.siconc.gn.v20181212',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P-HR.hist-1950.r1i1p2f1.SImon.sivol.gn.v20181212',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.siconc.gn.v20190314',
 'CMIP6.HighResMIP.EC-Earth-Consortium.EC-Earth3P.hist-1950.r1i1p2f1.SImon.sivol.gn.v20190314',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170915',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170915',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180221',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180221',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20181119',
 'CMIP6.HighResMIP.ECMWF.ECMWF-IFS-MR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20181119',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180730',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180730',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170921',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-LL.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170921',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.siconc.gn.v20170928',
 'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-MM.hist-1950.r1i1p1f1.SImon.sivol.gn.v20170928',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180606',
 'CMIP6.HighResMIP.MPI-M.MPI-ESM1-2-XR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180606',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200810',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-HR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200810',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.siconc.gn.v20200812',
 'CMIP6.HighResMIP.NCAR.CESM1-CAM5-SE-LR.hist-1950.r1i1p1f1.SImon.sivol.gn.v20200812',
 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.siconc.gn.v20210416',
 'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.SImon.sivol.gn.v20210416',
 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.siconc.gn.v20180701',
 'CMIP6.HighResMIP.NOAA-GFDL.GFDL-CM4C192.hist-1950.r1i1p1f1.SImon.sivol.gn.v20180701',
]

url_dict = {
    'qc':"https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog.json",
    'non-qc':"https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog_noqc.json",
    'retracted':"https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog_retracted.json"
}

iids_found = []
for catalog,url in url_dict.items():
    iids = search_iids(url)
    iids_found.extend(iids)
    print(f"Found in {catalog=}: {iids=}\n")

missing_iids = list(set(iids_requested) - set(iids_found))
print(f"\n\nStill missing {len(missing_iids)} of {len(iids_requested)}: \n{missing_iids=}")
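To illustrate the two layouts that `zstore_to_iid` in the script above accounts for, here is a small sketch that runs the same string logic on two store paths. Both paths are made up for illustration (the bucket and folder names are assumptions); only the slash-separated versus dot-separated structure matters:

```python
def zstore_to_iid(zstore: str) -> str:
    # Same logic as in the script above, with the shared split factored out.
    parts = zstore.replace("gs://", "").replace(".zarr", "").replace(".", "/").split("/")
    iid = ".".join(parts[-11:-1])
    if not iid.startswith("CMIP6"):
        iid = ".".join(parts[-10:])
    return iid

# Hypothetical store paths, for illustration only.
slash_style = (
    "gs://example-bucket/CMIP6/HighResMIP/AWI/AWI-CM-1-1-HR/"
    "hist-1950/r1i1p1f2/SImon/sivol/gn/v20170825/"
)
dot_style = (
    "gs://example-bucket/stores/"
    "CMIP6.HighResMIP.AWI.AWI-CM-1-1-HR.hist-1950.r1i1p1f2.SImon.sivol.gn.v20170825.zarr"
)

print(zstore_to_iid(slash_style))
print(zstore_to_iid(dot_style))
```

Both calls yield the same iid, which is what lets the script compare stores from either layout against `iids_requested`.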

So I'll leave this open for now, unless you think we can close this.

In any case, please make sure to cite the original CMIP6 data sources, and if you could acknowledge our efforts here (https://zenodo.org/badge/latestdoi/618127503) too, that would help a lot. Cheers.