Closed durack1 closed 2 months ago
Great thanks. Just to check, all institutes need to be registered on RoR now in order to play nice with the CMIP process?
@znichollscr that's what we are aiming for. This simplifies things a little on our end, as RoR intends to manage a lot of this info
Ok great, but there'll still be some institution ID key in https://github.com/PCMDI/input4MIPs-cmor-tables/blob/master/input4MIPs_institution_id.json#L5
Or is the plan to just check against RoR, and if it's not there, explode?
@znichollscr yep exactly. So as an example, your CMIP6-era contribution was from UoM (University of Melbourne), which is identified here. Ditto for all the other CMIP6-era contributions (CMIP6, input4MIPs, ...)
Thanks both, only just getting to this! I can't see an id for exeter on the RoR page. Do I need to directly edit the input4MIPs_CVs file? Is Exeter the right institution anyway, or could I make a more inclusive one like "CMIP7 CFTT Volcano emission team" and list multiple institutions ? Sources will be quite varied so it might be worth a discussion on how to best inform that. Apologies in advance for the trivial questions and for not engaging with this earlier.
Hi @thomasaubry, looks like U. Exeter is at https://ror.org/03yghzc09.
The way that we've worked in the past (with CMIP6) was to have an institution_id, and a source_id. The institution in this case makes sense to me to be UExeter (linked to the ROR that exists linked above), and then we need to determine a source_id which identifies the contributor(s). In CMIP6 a modelling group (e.g. NOAA-GFDL) contributed data from multiple models/source_id, e.g. GFDL-CM4, GFDL-ESM4 etc - for examples see CMIP6_source_id.html. Ideally, we want these short (<25 chars) as they will be used in the directories and filenames that the data will be published to ESGF on - for the CMIP6 template see here
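As a rough illustration of the "short (<25 chars)" guideline, here is a minimal sketch of a pre-registration check. The length limit comes from the comment above; the allowed character set (letters, digits, hyphens) is only a guess based on IDs like GFDL-CM4 and GFDL-ESM4, and `check_source_id` is a hypothetical helper, not part of any existing tool.

```python
import re

# Assumption: "<25 chars" means at most 24 characters.
MAX_LEN = 24
# Assumption: letters, digits and hyphens only, guessed from CMIP6-era IDs.
ALLOWED = re.compile(r"[A-Za-z0-9-]+")


def check_source_id(source_id: str) -> list[str]:
    """Return a list of problems; an empty list means the ID looks fine."""
    problems = []
    if len(source_id) > MAX_LEN:
        problems.append(f"too long: {len(source_id)} chars (limit {MAX_LEN})")
    if not ALLOWED.fullmatch(source_id):
        problems.append("contains characters other than letters, digits, hyphens")
    return problems


# The last candidate is a made-up ID that breaks both assumed rules.
for candidate in ("GFDL-CM4", "GFDL-ESM4", "a hypothetical overly long id"):
    print(candidate, "->", check_source_id(candidate) or "ok")
```

Returning a list of problems rather than raising keeps the check usable for reporting several issues at once.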
@znichollscr we have a prototype netcdf file available, so let me know where I can drop this so it can be checked - or how to run the checker on the data?
Good question, not really ready for that yet unfortunately! Will ping when I am
@znichollscr I was hoping we could highlight this tool during the meeting on Wednesday. Any chance this could happen, or are we still a while away?
Folks are starting to get to a point that demo files are available, so figured trying to steer everything in one direction (eval tool) is a better idea than my old scripts
We can talk about it and the idea, but actually using it is still a couple of months away unfortunately because I have to get my own data out first 🙃
@thomasaubry just looping back around on this source_id registration - will need to ascertain what we want to call this data and which institution is the host etc before we proceed - maybe a quick telco could be quickest?
(Edited comment after iteration with @durack1)
@thomasaubry great start to get all the data converted into netcdf! I just had a look at the file you shared with us. Some very quick feedback:
for strat_opt_aer* file
for utsvolcsulfur_* file - this is going to be a complicated file
ping @durack1
Hi @thomasaubry, congrats from me too.
As @vnaik60 says, there are quite a few things to tweak. It might be easiest if I give you a hand getting started. I'll drop you an email to find a time.
As a quick question, is there a CMIP6 data file we should be using as a template/example? Or are we doing a fresh start this time?
Perhaps this file is our starting point/example? https://esgf-data1.llnl.gov/thredds/dodsC/css03_data/CMIP6/PAMIP/IPSL/IPSL-CM6A-LR/pdSST-pdSIC/r164i1p1f1/Eday/aod550volso4/gr/v20191124/aod550volso4_Eday_IPSL-CM6A-LR_pdSST-pdSIC_r164i1p1f1_gr_20000401-20010531.nc
Although, seems unlikely given that is a model output file :) (and the link is now dead)
@znichollscr that seems like an output file (the link did not work for me though) Here is a description of what was available for CMIP6. For optical properties Beiping provided data on each model's spectral band. The data was made available at ftp://iacftp.ethz.ch/pub_read/luo/CMIP6/ which does not work any more. I am not sure if that data ever made it to ESGF (@durack1 ?). If you would like to see what was provided for GFDL model, I can request my colleague who processed the data to dig up and share.
Oops yes thank you.
If you would like to see what was provided for GFDL model, I can request my colleague who processed the data to dig up and share
I would flip this around. If you want us to provide a file that looks the same as was provided last time, send us a file and we can see how hard it would be to replicate :) (I know that consistency with CMIP6 is important, even if I still think it pushes us in the wrong direction)
Good point :-). I will share it here once we find it.
The files you want are the IACETH-SAGE3lambda-3-0-0 files on ESGF/input4MIPs. Having said that, Beiping generated model wavelength-targeted files for each of the modelling groups, so effectively did the mapping/interpolation of the native data to match the atmospheric vertical coords - probably a good idea to schedule a telco time so we can quickly calibrate and outline next steps
Ah ok cool thanks. Those files only have aerosol information in them, nothing about injection heights etc. Is the injection height stuff new for CMIP6, or is there an example file from CMIP6 for that too?
The injection heights etc. are new, as the volcanic SO2 emissions are a new dataset in CMIP7. That said, a dataset developed by Neely and Schmidt existed, which was used at least by WACCM for CMIP6.
Nice, thank you. Was the data format they used helpful? Or is it better to just put this data on a time-lat-lon-height grid, even if it is mostly nan?
It was helpful to the extent that we converted it to what could be used in our respective models (netcdf, regular time-alt-lat-lon grid and any other model related format change). We iterated with Tom earlier this year and agreed that it would be ok to include lat/lon/time only for which there is an eruption to keep the size of the file under control.
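To make the file-size argument concrete, here is a minimal sketch comparing a dense grid with an event-only layout. Everything in it is a placeholder: the field names, the two events, and the grid dimensions are invented for illustration and do not come from the actual dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eruption:
    # Hypothetical field names; the real file's variables may differ.
    time_index: int
    lat: float
    lon: float
    height_km: float
    so2_tg: float


# Placeholder events for illustration only (not real eruptions).
events = [
    Eruption(time_index=3, lat=15.0, lon=120.0, height_km=20.0, so2_tg=1.0),
    Eruption(time_index=40, lat=-40.0, lon=-72.0, height_km=12.0, so2_tg=0.2),
]

# Hypothetical dense grid dimensions: time x height x lat x lon.
n_time, n_height, n_lat, n_lon = 120, 40, 180, 360
dense_cells = n_time * n_height * n_lat * n_lon

# The event-only layout stores five numbers per eruption.
sparse_values = len(events) * 5

print(f"dense grid cells: {dense_cells}")
print(f"sparse values:    {sparse_values}")
```

In the dense layout almost every cell would be NaN, which is the size problem the event-only layout avoids.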
By the way, just to confirm that we are discussing two files here - 1) aerosol optical properties that were also made available in CMIP6 (@durack1 shared this in https://github.com/PCMDI/input4MIPs_CVs/issues/9#issuecomment-2226154177) and 2) volcanic SO2 emissions that were not made available in CMIP6 but were available externally (https://github.com/PCMDI/input4MIPs_CVs/issues/9#issuecomment-2226186387).
Thanks @vnaik60 for sending the previous format via email. I've put ncdumps of IACETH-SAGE3 and the file that was sent below. I have to admit that it is not clear to me at all how the two map, but I assume it will make more sense to @thomasaubry once we have a look tomorrow.
Hi everyone, just dropping a line to apologize for the silence! I had a grant deadline last Friday and was off yesterday/this morning so re-emerging...I will reply/add comments/etc today/over the week (although slowly because I took a bit of time off...repainting my flat!)
Just cross-tagging repos - awaiting action at https://github.com/PCMDI/mip-cmor-tables/issues/60
Suggested source ID entry (same idea as #42)
"UoE-CMIP-0-1-0":{
"contact":"T.Aubry@exeter.ac.uk",
"further_info_url":"www.tbd.invalid",
"institution_id":"UoE",
"license_id":"CC BY 4.0",
"mip_era":"CMIP6Plus",
"source_version":"0.1.0"
}
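A quick consistency check on an entry like this could look as follows. This is only a sketch: the two conventions it tests (the ID starts with the `institution_id`, and ends with the `source_version` written with hyphens) are assumptions inferred from examples such as IACETH-SAGE3lambda-3-0-0, and `check_entry` is a hypothetical helper.

```python
import json

# The suggested entry, wrapped in braces so that it parses as JSON.
entry_text = """
{
  "UoE-CMIP-0-1-0": {
    "contact": "T.Aubry@exeter.ac.uk",
    "further_info_url": "www.tbd.invalid",
    "institution_id": "UoE",
    "license_id": "CC BY 4.0",
    "mip_era": "CMIP6Plus",
    "source_version": "0.1.0"
  }
}
"""


def check_entry(source_id: str, entry: dict) -> list[str]:
    """Check assumed (not official) naming conventions for a source_id."""
    problems = []
    if not source_id.startswith(entry["institution_id"] + "-"):
        problems.append("source_id does not start with institution_id")
    version_suffix = "-" + entry["source_version"].replace(".", "-")
    if not source_id.endswith(version_suffix):
        problems.append("source_id suffix does not match source_version")
    return problems


for source_id, entry in json.loads(entry_text).items():
    print(source_id, "->", check_entry(source_id, entry) or "consistent")
```

If the `institution_id` is later renamed, the first check would flag that the `source_id` needs renaming too.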
I was just talking to @wolfiex, he was suggesting that when we register "University of Exeter" we do this with a unique identifier, "UoE" is a little vague, and could also be an identifier for the "University of Edinburgh", "University of England" (if there was one, etc). So his recommendation was for "UoExeter" which is going to be a little hard for other places to attempt to claim.
@thomasaubry does that sound like a reasonable path to you? @wolfiex will have the mip-cmor-tables ready next week for this registration to occur, so we can get that done, alongside finalizing these volcanic datasets.. exciting!
To connect information across repos, some early data validation was completed by @shipengzhang and exists in two notebooks https://github.com/shipengzhang/evaluate_cmip_volcano/blob/main/volcano/examine_volc_prescribed.ipynb https://nbviewer.org/github/shipengzhang/evaluate_cmip_volcano/blob/main/volcano/examine_volc_prescribed.ipynb https://github.com/shipengzhang/evaluate_cmip_volcano/blob/main/volcano/read_input_emi.ipynb https://nbviewer.org/github/shipengzhang/evaluate_cmip_volcano/blob/main/volcano/read_input_emi.ipynb
It's great to have some eyes on these data, which look quite different from their CMIP6 counterparts!
@wolfiex @durack1 sounds great! I've been using "uoexeter", happy to go with or without capitals :)
The id will be lowercase, but there are no such limitations for the CMIP acronym itself. Convention suggests capitalisation with the exception of stopwords so UoExeter might be an option?
I guess the only consideration at this point is what the convention for other consortia and institutions is likely to be.
@thomasaubry for the final registration (which we should be able to get in the queue ~tomorrow, @wolfiex is finalizing some updates this week), we could go with "UoExeter", which would sync with some of the other registrations we already have e.g. "UoM", "UofMD", but you'll see these weren't really systematic, just unique. The older examples are here input4MIPs_institution_id.json - encapsulated below, whereas we are merging institutions across projects (so input4MIPs is just one MIP project, CMIP6, CMIP5, obs4MIPs etc are others, and there are often cases where the same institution contributes to more than a single project - e.g. many modeling groups contributed to CMIP3, CMIP5, CMIP6, and will contribute to CMIP7 - and keeping their institution_id consistent across these projects is the goal). Any interest in providing the volcanic forcing for CMIP8 😄
[
"CCCma",
"CNRM-Cerfacs",
"CR",
"DRES",
"IACETH",
"IAMC",
"ImperialCollege",
"MOHC",
"MPI-B",
"MPI-M",
"MRI",
"NASA-GSFC",
"NCAR",
"NCAS",
"PCMDI",
"PNNL-JGCRI",
"SOLARIS-HEPPA",
"UCI",
"UColorado",
"UReading",
"UoM",
"UofMD",
"VUA"
]
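Since the id itself will be lowercase, a new `institution_id` presumably needs to stay unique after lowercasing. A minimal sketch of that check against the list above (the `clashes` helper is hypothetical, and the case-insensitive rule is an assumption drawn from this thread rather than a documented requirement):

```python
# Existing input4MIPs institution_id values, copied from the list above.
EXISTING = [
    "CCCma", "CNRM-Cerfacs", "CR", "DRES", "IACETH", "IAMC",
    "ImperialCollege", "MOHC", "MPI-B", "MPI-M", "MRI", "NASA-GSFC",
    "NCAR", "NCAS", "PCMDI", "PNNL-JGCRI", "SOLARIS-HEPPA", "UCI",
    "UColorado", "UReading", "UoM", "UofMD", "VUA",
]


def clashes(candidate: str, existing: list[str] = EXISTING) -> list[str]:
    """Return existing IDs that equal the candidate when case is ignored."""
    return [e for e in existing if e.lower() == candidate.lower()]


for candidate in ("UoExeter", "uoexeter", "uom"):
    found = clashes(candidate)
    print(candidate, "->", f"clashes with {found}" if found else "free")
```

This only catches literal collisions; whether an acronym is ambiguous (the "UoE" concern) still needs a human call.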
@thomasaubry just creating an issue as a placeholder for discussions in finalizing the registration of the volcanic forcing institution_id and source_id.
Note the CMIP6 contribution had institution_id: IACETH (here) and a couple of versioned releases, so source_id entries: IACETH-SAGE3lambda-2-1-0 and 3-0-0 (here).
We have updated the institution registration a little moving beyond CMIP6, these now depend on the RoR registry (see here), and, as an example UExeter is already registered - https://ror.org/03yghzc09
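For the "check against RoR, and if it's not there, explode" idea, a first offline step could be to confirm that a link at least looks like a ROR ID before any registry lookup. The sketch below assumes the ROR ID shape is a leading 0, six characters from a base32 alphabet that omits i, l, o and u, and two trailing digits; it does not verify the checksum or contact the registry, and `extract_ror_id` is a hypothetical helper.

```python
import re

# Assumed ROR ID shape (see note above); the checksum is not verified here.
ROR_ID = re.compile(r"0[0-9a-hjkmnp-tv-z]{6}[0-9]{2}")


def extract_ror_id(url_or_id: str) -> str:
    """Return the bare ROR ID, or raise ValueError if it is malformed."""
    candidate = url_or_id.strip().rstrip("/").rsplit("/", 1)[-1].lower()
    if not ROR_ID.fullmatch(candidate):
        raise ValueError(f"not a well-formed ROR ID: {url_or_id!r}")
    return candidate


print(extract_ror_id("https://ror.org/03yghzc09"))
```

A well-formed ID can still be unregistered, so this would only be a cheap filter ahead of a real lookup.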
@wolfiex @matthew-mizielinski @taylor13 @vnaik60 @znichollscr ping