ror-community / ror-updates

Central tracking for updates to ROR

Project: WCRP-CMIP Organizations #5998

Closed adambuttrick closed 7 months ago

adambuttrick commented 8 months ago

Lawrence Livermore National Laboratory has identified a set of organizations participating in the World Climate Research Programme Coupled Model Intercomparison Project (WCRP-CMIP) that they would like reviewed and added to the registry, where appropriate. The list of organizations is available below:

https://github.com/WCRP-CMIP/CMIP6_CVs/blob/master/CMIP6_institution_id.json

durack1 commented 8 months ago

Just copying emailed content into this repo/issue.

Wow, nice Adam, glad we’re cooking with gas and already a fair way down the path to being ROR’d!

I had some queries.

First up, aliases. We’ve done a far cleaner job identifying institutions (institution_id’s) in CMIP6 than in the previous phases, such that the ids, e.g., “NOAA-GFDL”, are being used consistently across all downstream materials (e.g., IPCC AR6 Annex II table AII.5 here). In previous phases, however, the exact same institution (location, funding, etc.) was identified differently, such that “GFDL”, “NOAA Geophysical Fluid Dynamics Laboratory”, and “U.S. Department of Commerce/National Oceanic and Atmospheric Administration (NOAA)/Geophysical Fluid Dynamics Laboratory (GFDL), USA” (e.g., AR4 here) all identify the same institution.

How can we add all these aliases into the ROR entry?
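To make the alias question concrete, here is a minimal sketch of how the historical names could be tied to one identifier on the CMIP side, independent of how ROR itself stores aliases. The ROR ID is a placeholder, and the `ALIASES` table, `normalize`, and `resolve` names are hypothetical, introduced only for illustration.

```python
# Illustrative only: the ROR ID below is a placeholder, not a real identifier.
ALIASES = {
    "https://ror.org/PLACEHOLDER-gfdl": [
        "NOAA-GFDL",  # CMIP6 institution_id
        "GFDL",  # name used in earlier phases
        "NOAA Geophysical Fluid Dynamics Laboratory",
    ],
}


def normalize(name):
    """Case-fold and collapse whitespace so lookups tolerate minor variations."""
    return " ".join(name.split()).casefold()


def build_index(aliases):
    """Invert the table: one normalized alias -> one identifier."""
    index = {}
    for ror_id, names in aliases.items():
        for name in names:
            index[normalize(name)] = ror_id
    return index


ALIAS_INDEX = build_index(ALIASES)


def resolve(name):
    """Return the identifier for a known alias, or None if it is not listed."""
    return ALIAS_INDEX.get(normalize(name))
```

With a table like this, `resolve("GFDL")` and `resolve("NOAA-GFDL")` return the same identifier, while an unlisted name returns `None`.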

Second, groups/faculties/divisions within an ROR-identified org. You’ve already flagged consortia that exist in the CMIP6_CVs (e.g., E3SM-Project, EC-Earth-Consortium, etc). We also have UCI registered, which expands to “Department of Earth System Science, University of California Irvine, Irvine, CA 92697, USA” (here), but we also have UCI-CHRS (here) which is another center/division/faculty within the same org.

How do we clearly identify each of these when “UCI” is the identifier that we’ve been using?

And third, changing consortia through time. A good example is the CSIRO (Commonwealth Scientific and Industrial Research Organisation, Australia) contribution through time. In CMIP6, this is “CSIRO”, along with some university collaborations: “CSIRO-ARCCSS” (ARC Centre of Excellence in Climate System Science – a consortium of universities, now wound down) and “CSIRO-COSIMA” (Consortium of Ocean-Sea Ice Modelling in Australia – another university consortium, mainly ANU). In CMIP5, this was “CSIRO-BOM” (joint with the Australian Bureau of Meteorology) and “CSIRO-QCCCE” (Queensland Climate Change Centre of Excellence, now wound down), and in CMIP3/2 etc. this was “CSIRO”.

How did you plan to manage such evolution in the ROR registries?

We will be unable to rewrite the data that have been generated with these entries back in time, but would like to figure out a way of moving forward while recognizing contributions for some existing and closed centres.

@taylor13 @wolfiex ping

adambuttrick commented 8 months ago

@carlyrobinson This request includes two DOE projects, E3SM-Project and RUBISCO. Can you speak to whether these should be discretely represented in ROR or identified by the IDs of their participants?

carlyrobinson commented 8 months ago

@adambuttrick these may be good to discuss with the larger group. I'm not familiar with either of these. They seem to be both models and projects. I'm not sure if we've seen a similar use case. It might be helpful to understand if these are being used as affiliations or what other ROR use case is being implemented/planned.

durack1 commented 8 months ago

Thanks @carlyrobinson. Both are DOE-funded climate research projects. The first (E3SM) is a climate modeling effort, with contributions across a broad range of DOE labs (e.g., here), and the second is a DOE ORNL project that has links to other labs and universities (e.g., here).

arthurpsmith commented 8 months ago

I think we need to push back on this - most of these are already in ROR. Some of them are combinations of two (or more?) ROR institutions - CNRM-CERFACS for example, we're not going to create a new ROR id for that pairing are we? They need to model it on their end as a combination of two entities. Similarly for CSIRO-ARCCSS etc. If there are organizations in this list that are not in ROR and not some combination of existing ROR entities then please identify them, I don't think that should be ROR's job.

adambuttrick commented 8 months ago

@arthurpsmith Thanks for reviewing! Yes, I've already flagged for them having internal records that join the entities. I linked this issue as context in our notes for discussing #5821 and the DOE projects/projects more generally.
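As a rough illustration of what such an internal record could look like, the sketch below maps a CMIP `institution_id` to the set of ROR IDs of its member organizations. The `InstitutionRecord` class, the `RECORDS` table, and the `ror_ids_for` helper are hypothetical, and the ROR IDs are placeholders rather than real identifiers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InstitutionRecord:
    """One CMIP institution_id and the ROR IDs of the entities behind it."""

    institution_id: str
    member_ror_ids: Tuple[str, ...]


# Placeholder ROR IDs: assumptions for illustration, not real identifiers.
RECORDS = {
    "CNRM-CERFACS": InstitutionRecord(
        "CNRM-CERFACS",
        (
            "https://ror.org/PLACEHOLDER-cnrm",
            "https://ror.org/PLACEHOLDER-cerfacs",
        ),
    ),
    "NOAA-GFDL": InstitutionRecord(
        "NOAA-GFDL",
        ("https://ror.org/PLACEHOLDER-gfdl",),
    ),
}


def ror_ids_for(institution_id):
    """Return the member ROR IDs for an institution_id, or [] if unknown."""
    record = RECORDS.get(institution_id)
    return list(record.member_ror_ids) if record else []
```

A joint entry such as CNRM-CERFACS then resolves to two ROR IDs without needing a registry record for the pairing itself.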

arthurpsmith commented 8 months ago

Ah, sorry, I see from the referenced github issues above that a lot of these are in ROR now because they were added thanks to this request. Wasn't clear on what the issue for discussion was here but understood now. Thanks.

carlyrobinson commented 8 months ago

A bit of information about DOE and our support for persistent identifiers for models. DOE-funded research outputs (e.g., data, software, models) should be reported to DOE's Office of Scientific and Technical Information (OSTI). Research outputs are provided to OSTI by either DOE labs or by financial assistance recipients/grantees. They are submitted primarily using our ingest system E-Link, though DOE-funded software should be submitted through DOE CODE.

Through both the data and software records submission workflows, OSTI assigns DOIs to these research outputs. Models have previously been submitted to OSTI as both data and software, depending on how the submitter/community thinks about the model and how the output is structured.

Here are a few examples (all of which have DOE-assigned DOIs):

Software: https://www.osti.gov/doecode/biblio/115054, https://www.osti.gov/doecode/biblio/73414

Data: https://www.osti.gov/dataexplorer/biblio/dataset/1819956, https://www.osti.gov/dataexplorer/biblio/dataset/1579361

OSTI already has many research outputs associated with E3SM - https://www.osti.gov/search/term:%22E3SM%22.

durack1 commented 8 months ago

> I think we need to push back on this - most of these are already in ROR. Some of them are combinations of two (or more?) ROR institutions - CNRM-CERFACS for example, we're not going to create a new ROR id for that pairing are we? They need to model it on their end as a combination of two entities. Similarly for CSIRO-ARCCSS etc. If there are organizations in this list that are not in ROR and not some combination of existing ROR entities then please identify them, I don't think that should be ROR's job.

@arthurpsmith just a general reply. For some of these entities, back in the past (~1990 for the first IPCC/AMIP/CMIP phase) they may have been a single org, whereas, circa 2023, they are now defined as two entities (in the case of CNRM and CERFACS). So there will need to be a way to manage these changes (splitting, joining, losing half of a consortium, etc.) over time.

arthurpsmith commented 8 months ago

@durack1 yes that's not an unusual situation - about a year ago ROR added a "predecessor/successor" relationship (see https://ror.readme.io/changelog/2022-12-01-organization-status-changes) and the "inactive" status to handle merge/split changes or other types of fundamental alterations in an organization's structure.
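A minimal sketch of how a downstream consumer might use these fields: starting from an identifier found in older data, follow successor links until active records are reached. The record layout here (`status`, `relationships`, and each relationship's `type` and `id`) is an assumption modelled loosely on ROR records and should be checked against the ROR schema documentation; the `org-a`/`org-b`/`org-c` identifiers and the function names are hypothetical.

```python
# Toy registry: "org-a" was split into "org-b" and "org-c".
# Field names are assumptions; identifiers are placeholders.
REGISTRY = {
    "org-a": {
        "status": "inactive",
        "relationships": [
            {"type": "Successor", "id": "org-b"},
            {"type": "Successor", "id": "org-c"},
        ],
    },
    "org-b": {
        "status": "active",
        "relationships": [{"type": "Predecessor", "id": "org-a"}],
    },
    "org-c": {
        "status": "active",
        "relationships": [{"type": "Predecessor", "id": "org-a"}],
    },
}


def successors(record):
    """IDs of records listed as successors (type compared case-insensitively)."""
    return [
        rel["id"]
        for rel in record.get("relationships", [])
        if rel.get("type", "").lower() == "successor"
    ]


def resolve_active(ror_id, registry):
    """Follow successor links from ror_id and return the active records reached."""
    seen = set()
    active = []
    stack = [ror_id]
    while stack:
        current = stack.pop()
        if current in seen or current not in registry:
            continue  # skip cycles and identifiers missing from the registry
        seen.add(current)
        record = registry[current]
        if record.get("status") == "active":
            active.append(current)
        else:
            stack.extend(successors(record))
    return sorted(active)
```

Existing data can keep the original identifier, since the lookup from an inactive record to its current successors happens at read time rather than by rewriting the data.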