Closed eguil closed 8 years ago
@eguil are there any amendments to the standing source_id (model) template that should be augmented so we have information in one place - take a look here
The experiment_id template should also be reviewed - take a look here or for a web-based tabulated version, here
Please add any other ES-DOC contributors that you think should be involved in discussing this issue using their github @handle
@taylor13 pinging you here
I can see problems with the institute_id in the model description. Is this a list of institutes that contributed to the model (in which case we are going to need to expand the CMIP6 institute codes massively), the institute that is running the model, the institute funding the model development, or the institute that will be the point of contact for questions about the model?
I had understood that the citation system would handle this complexity.
In my view we should not have hierarchical dependent CVs. There is a model (source) and an institute; they are separate entities which are joined together when describing a data set. So UKMO uses HadGEM3-GC3 to produce a simulation for the SSP2.4 experiment.
Re source_id: A homepage attribute might be useful;
Re experiment_id: Some info is not yet incorporated into the ES-DOC viewer - I will rectify this.
The sub-model details should follow the realm classification. It is mostly OK - but surely "glacier" should be "land_ice" to be consistent with other parts of the CMIP6 infrastructure. What about "aerosols"?
The model component codes used in the experiment_id template do not match the codes used in ES-DOC. Some codes used are not even component types - AOGCM is not a component type, it's a model type ... and at the moment I don't think it is in the list of model types in ES-DOC.
If these templates go out to end users it is going to be unnecessarily confusing for modelling groups having to use two different conventions for providing metadata for CMIP6. It can't be that hard to align these two sets of codes.
Apologies for being a bit tetchy about this - but I have been banging on about it for some time, and I'm one of the people who has to deal with the confusion it creates at a modelling centre. To give just one example: internal metadata systems will need to maintain multiple enumeration lists for the same thing and know where to apply each list ;-(
Regarding: institute_id in the model description. The citation currently uses the CV to provide the entries for the citation GUI. I interpreted the institute_id in the source_id CV as the institutes running a model. It is still not clear to me where I will get access to the connections between which institute wants to run which models and which MIPs/experiments. I can get along without the connection to the experiments, though it would be more complicated, but the connection between institute and model is essential, as it is related to data ownership/responsible institution for data and data citation.
Martina
I agree – model, institute and experiment are separate things – when they are all brought together then we have the core information for a data citation. If the model and institute are related then we possibly have a model citation. I don’t understand why institute is recorded in the model CV.
In our case it would need to deal with two institutes responsible for funding its core development, multiple institutes using it for running simulations, multiple institutes funding local extensions and multiple institutions contributing to the science.
Mark
I agree that the CMIP6 CV that Karl and Paul are collecting from the modelling groups should only be about the model (source_id) and the institute, i.e. institution_id (running the model, i.e. providing the data). This means only CV pairs such as "HadGEM3, UKMO" have to be provided by groups. The experiment_id is indeed suggested by the MIP chairs (and approved by the CMIP panel/WIP) and the model detail collected via ES-DOC. See my comments inline in https://docs.google.com/document/d/1HyKbkftWPnGkSPZC6I6nd58dRs2wxnyyJnIQFouVYRk/edit?ts=57d04716
Just one more thought on the CVs (btw. I do not have access to the google doc Eric cited): We need to make clear, what the CVs are for. In my opinion, their core function is to be a reference for DRS names for the CMIP6 infrastructure components including QC.
As modeling centers should register their contribution for CMIP6, it would be possible to capture the connections among institute_id, source_id and experiment_id, which would be extremely helpful at least for citation.
However, I do not think that it is a good idea to collect too many details about the model (DRS component source_id) within the CV. These details are collected by CIM/ES-DOC. In my view the CV is the reference and ES-DOC needs to synchronize its model information with that provided in the CV for source_id.
Hi Martina, I fully agree. The connection between the experiment description document (CIM) and the model description documents (CIM) will be made at the ESGF data publication stage (which will harvest institute_id, source_id and experiment_id from the global attributes) via the ES-DOC automated scripting currently tested.
To clarify, what we (Paul and I) aim to achieve with the CMIP6_CVs is to:
The aim is not to provide comprehensive documentation of either the experiments or the models - this is the clearly defined role of ES-DOC.
We agree that a procedure should be established to ensure that the information contained in the CMIP6_CVs *.json files propagates to all the other software supporting CMIP6, and that this be made clear before contacting the modeling centers requesting that they “register” key information about their model(s) and institution. The CV information should definitely be synchronized between ES-DOC and CMIP6_CVs. @markelkington we completely agree with you that correct information should be obtained once, be consistent, and be reused.
Here’s how we expect the CVs to be used:
We note that simple CVs (i.e., simple lists of acceptable text strings) cannot capture all the information needed to meet the needs of above (most notably the QC implementation needed for steps 2 and 4). For QC we need to specify the allowed combinations of CV values. This is why the CVs in this repo have been structured the way they are.
We would also note that we expect ES-DOC will continually update its own data base (sim documents?) by ingesting new models and institutions as they are registered. “Stub” “landing pages” could then be immediately generated for each new simulation, providing a “live” target pointed to by ESGF publication data sets (via the further_info_url). Initially, of course, the landing page might only repeat model, institution, and experiment information already recorded in the netCDF output files themselves, but the absence of more complete documentation (requested by ES-DOC) would be obvious and might prompt modelling groups to supply the additional details needed in a more timely manner than in previous CMIP phases.
In general, the virtue of having modeling groups provide input via github is that it is clearly visible, transparent and easily updated, or corrected if necessary.
It might be useful to include here an email I sent to the WIP members, Charlotte Pascoe, and a few others on 7/15/16: This email provides a little more background and description of the CVs hosted on this github repo:
Dear all,
Paul Durack and I have now created JSON files defining controlled vocabularies (CVs) that are essential to ESGF, CMOR, the data request, and ESDOC. These files are located on github (https://github.com/WCRP-CMIP/CMIP6_CVs) as called for by one of our draft position papers ( https://docs.google.com/document/d/1CzTUoX4H2S0XbQUM3_9yKvJ2la7qUExFV7ibGzThmhA/edit ). These are not finalized, but they serve as the "reference" for certain CMIP6 CVs.
I would note, that Martin is responsible for the variable CV information, and I think a subset of the information contained in the variable request (see the document referenced below) should also be made available on the github site (namely the information essential to ESGF).
The CVs stored here will not comprehensively meet the needs of CMIP6, but they provide the foundation for:
DRS
ESGF
the data request
ESDOC
CMOR and the CMIP6 validator
Some of these CVs are pretty well agreed upon, while others might still evolve. It might be appropriate to include additional CVs on this github repository. Please advise.
I've also prepared a document that describes the vocabularies (https://docs.google.com/document/d/1N0pLdUA7_lgmK93MIQtdSeelHWPodJYOcWhDFDHiQ90), and I have shared it so you can comment and suggest changes.
Note that the CVs don't provide _everything_ one might want to know. Additional descriptions, specifications, relationships, documentation, etc. will be recorded and made available from:
the data request
ESDOC
CMOR tables
Also the CMOR tables, ESDOC, and the data request database developed by Martin will duplicate some of the CV information stored on the github site, but those will be derivative and not the reference. It will be important that all three of those remain synchronized with the JSON files on github.
Please look over both the CVs on github and the "CV_responsibilities" document on google docs (url's for both given above), and provide comments (registered as issues on the github or as comments/suggestions on google docs, as appropriate).
There are known inaccuracies with all the CVs currently on github, so some of your suggestions we may already be aware of. Please be patient.
If you reply to this email (as opposed to submitting issues), please "reply all".
thanks, Karl
@markelkington we agree that "land_ice" should replace "glacier". In general if ES-DOC has established a CV for the component models (including aerosols?), we should adopt it. Can someone provide the list of components as a separate issue on this repo (include "source_id" as the first word of your title)?
Also, concerning the difference between "source_type" (a separate CV) and the list of possible component models included in the source_id CV. The WIP agreed some time ago that source_id would not change even if the "components" comprising it were turned on or off. For example, a coupled model run in AMIP mode (i.e., only the atmospheric and land components active), would have the same name as when it was run in coupled mode (as an AOGCM). The experiment of course implies a certain model configuration, but there remains some flexibility (e.g., running with atmospheric chemistry on or off). If two versions of a model were to run the same experiment, then we distinguish between them using the "p" index of the "ripf" indicator. Also, the global attribute "source_type" would be different for the two runs. The options proposed for "source_type" are listed in note 13 after table 1 in the WIP's global attribute document: https://docs.google.com/document/d/1h0r8RZr_f3-8egBMMh7aqLwy3snpD6_MrDz1q8n5XUk/edit#
If the list of source_types is inconsistent with ES-DOCs, please raise an issue pointing this out. Note that "source_type" is not meant to be a simple list of component models. Rather it distinguishes among different categories of models, which will make it more useful in ESGF searches. If anyone can think of a much better way of doing this, please propose it immediately. We want to finalize the global attributes document this weekend.
The list of component models appearing in the source_id should be comprehensive in the sense that if in any CMIP6 experiment a component is included, then it should be listed even if for other experiments the component is inactive (e.g., both the atmospheric and ocean components should be specified if the model runs DECK experiments even though in the AMIP run, the ocean is turned off).
@markelkington Just to confirm what others have suggested: For QC it is important to check whether the institution/source (i.e., model) pairs have been registered. In the source_id CV we therefore list the institution_ids of all institutions who have indicated they plan to contribute CMIP6 simulations generated by a given model. In most cases there will be only a single institution_id listed. Note that there is a separate institution_id CV where the full name and address associated with each institution_id are provided.
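The pair check described above could be sketched roughly as follows. This is only an illustration: it assumes the source_id CV is a dictionary keyed by source_id, with each entry carrying an "institution_id" list. The function name `is_registered_pair` and the placeholder entry "EXAMPLE-MODEL" are hypothetical and not taken from the repo files.

```python
# Hypothetical excerpt of a source_id CV. The layout (an "institution_id"
# list per source_id) is an assumption; "EXAMPLE-MODEL" and its institutions
# are placeholders, not registered values.
SOURCE_ID_CV = {
    "HadGEM3-GC3": {"institution_id": ["UKMO"]},
    "EXAMPLE-MODEL": {"institution_id": ["INST-A", "INST-B"]},
}


def is_registered_pair(source_cv, source_id, institution_id):
    """Return True if institution_id is listed for source_id in the CV."""
    entry = source_cv.get(source_id)
    if entry is None:
        # Unknown model: the pair cannot have been registered.
        return False
    return institution_id in entry.get("institution_id", [])
```

A QC step could call `is_registered_pair(SOURCE_ID_CV, source_id, institution_id)` with the values read from a file's global attributes and reject the file when it returns False.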
Similarly, in the experiment_id CV we plan to modify the structure slightly as discussed in https://github.com/WCRP-CMIP/CMIP6_CVs/issues/1 . The plan is to remove "sub_experiment" from this CV, and only include the list of possible "sub_experiment_ids". Then we will create a new CV called "CMIP6_sub_experiment_id.json" which will be a dictionary with "sub_experiment_id" as the key and "sub_experiment" the value associated with each key.
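One way to picture the planned restructuring is the sketch below. It assumes each experiment entry currently holds a single "sub_experiment_id"/"sub_experiment" pair of strings, as in the 1pctCO2 entry; the function name `split_sub_experiments` is hypothetical, and the real files may end up organised differently.

```python
# Minimal experiment_id CV excerpt (only the keys relevant to this sketch).
EXPERIMENT_CV = {
    "1pctCO2": {
        "experiment": "1 percent per year increase in CO2",
        "sub_experiment": "none",
        "sub_experiment_id": "none",
    },
}


def split_sub_experiments(experiment_cv):
    """Drop "sub_experiment" from each entry and collect the id -> label pairs.

    Returns (slimmed experiment CV, sub_experiment_id dictionary).
    """
    slim_cv = {}
    sub_experiment_cv = {}
    for experiment_id, entry in experiment_cv.items():
        entry = dict(entry)  # copy so the input CV is left untouched
        label = entry.pop("sub_experiment", None)
        sub_id = entry.get("sub_experiment_id")
        if sub_id is not None and label is not None:
            sub_experiment_cv[sub_id] = label
        slim_cv[experiment_id] = entry
    return slim_cv, sub_experiment_cv
```

The second return value has the shape proposed for the new CV: "sub_experiment_id" as the key and "sub_experiment" as the value.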
Hi Karl
Re: land-ice. The CMOR and ES-DOC lists are subtly different. ES-DOC references scientific realms (atmosphere, aerosols, land, land-ice, etc.). Component is usually used to refer to a physical model component, which may implement one or more realms, e.g. nemo. It will be possible to use one CV for both CMOR and ES-DOC as long as we agree that the component in CMOR is equivalent to scientific realm.
Re: source-type – do you want me to raise the issue in github for CMOR. I raised it in the ES-DOC repository some months ago. I don’t really mind which list is used (or even a combination of the list values) as long as there is just one list and it is agreed who maintains the content of that list. [Regarding the issue of using AOGCM as the value when we are running in atmosphere only mode – that seems OK to me]
Re: source-id – my interpretation of your explanation is that we will have one component list for each model – and it will be the full set of components that we use in the model even if some are turned off for a particular MIP/experiment (and the source-type represent this full configuration model). Is that correct?
Regards
Mark
Karl
Agree with both of those points
Mark
@markelkington
Hi Mark,
Re: your paragraph labeled "Re: land-ice": Thanks for explaining how things are done in ES-DOC and providing the nemo model example. That has me thinking we should perhaps back off on trying to record quite so much information about the model components in the “source” global attribute. We were thinking that as in CMIP5 we might want to include in “source”: the name of the full model, the vintage, and the names of the component models. It might be confusing to list “nemo” as both the ocean model and the sea ice model. Perhaps “source” should just record 1) a complete and precise identifying label for the model (as it normally would be documented by the modeling group), and 2) model vintage (i.e., year the model was first used in a scientific application). For example:
source_id = “GFDL-CM2-1”
source = “GFDL CM2.1: cycle 2.1.14 (2012)”
(Note there would be no restriction on "source" to remove forbidden characters like blanks and parentheses.)
Another option is something along the lines of CMIP5. For example:
source_id = “CCSM2”
source = “CCSM2 (2002) atmosphere: CAM2 (cam2_0_brnchT_itea_2, T42L26); ocean: POP (pop2_0_ver_1.4.3, 2x3L15); sea ice: CSIM4; land: CLM2.0”
What do you and others think? thanks, Karl
@markelkington Hi Mark,
Re your paragraph labeled "Re: source_id" -- yes, the original intention was to include all components, whether active or not. [This assumes we still plan to list the components in the source_id CV.] Karl
@eguil @markelkington @MartinaSt @momipsl as @taylor13 noted, it is best that we attempt to align our (controlled) vocabularies between CMIP6_CVs and ES-DOC. In particular the experiment_id and source_id.
Do you folks have a website/repo that lists the ES-DOC vocabs?
As you already have placeholder descriptive pages for the experiment_id, would it be useful that we include an ES-DOC_url or equivalent entry in each experiment entry so that it's clear where the detailed documentation can be obtained?
So for an example, the standing experiment_id entry for 1pctCO2 becomes:
"1pctCO2":{
    "activity_id":[
        "CMIP"
    ],
    "additional_allowed_model_components":[
        "AER",
        "CHEM",
        "BGM"
    ],
    "description":"DECK: 1pctCO2",
    "ES-DOC_url":"http://view.es-doc.org/?renderMethod=id&project=cmip6-draft&id=18200dd7-c51e-4a23-9485-9a86ffc13dd5",
    "end_year":"",
    "experiment":"1 percent per year increase in CO2",
    "min_number_yrs_per_sim":"150",
    "parent_activity_id":[
        "CMIP"
    ],
    "parent_experiment_id":[
        "piControl"
    ],
    "required_model_components":[
        "AOGCM"
    ],
    "start_year":"",
    "sub_experiment":"none",
    "sub_experiment_id":"none",
    "tier":"1"
},
If this was useful, we could also do a similar thing for the activity_id, so expand the current list to be a dictionary with a similar ES-DOC_url entry for each, and as @taylor13 noted above we could add the placeholder ES-DOC_url to the source_id once the entry/page has been generated on the ES-DOC site.
Regarding the source_id question that @taylor13 has noted above, my preference would be to include all the basic identifying information that we currently have in the placeholder ACCESS-1-0 example, with the glacier -> land_ice change, the addition of aerosols (thanks @markelkington) and any additional vocab tweaks to maintain consistency. The addition of an ES-DOC_url would also best integrate the information across these systems.
@markelkington Hi Mark, regarding your comment: "Re: source-type – do you want me to raise the issue in github for CMOR. I raised it in the ES-DOC repository some months ago. I don’t really mind which list is used (or even a combination of the list values) as long as there is just one list and it is agreed who maintains the content of that list. [Regarding the issue of using AOGCM as the value when we are running in atmosphere only mode – that seems OK to me]"
Re your last sentence: The current specification (in the WIP global attributes document referred to above) says that a model performing an AMIP run should have the same source_id as when it is run in coupled mode (e.g., HadGEM1), but that source_type should be "AGCM" for AMIP and "AOGCM BCM" for a concentration-driven coupled model run (e.g., "historical") that includes a biogeochemical component model. (In this case neither should be "AOGCM".)
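To illustrate how a source_type value might be checked against an experiment entry, here is a rough sketch. It assumes source_type is a space-separated string of codes (as in "AOGCM BCM") and that the experiment entry carries the "required_model_components" and "additional_allowed_model_components" lists shown in the 1pctCO2 example earlier in this thread. The function name `source_type_allowed` is hypothetical, and the rule itself is an assumption about how QC could work, not an agreed specification.

```python
# Component lists copied from the 1pctCO2 entry quoted in this thread.
EXPERIMENT_ENTRY = {
    "required_model_components": ["AOGCM"],
    "additional_allowed_model_components": ["AER", "CHEM", "BGM"],
}


def source_type_allowed(experiment_entry, source_type):
    """Check a space-separated source_type string against an experiment entry.

    Assumed rule: every required code must be present, and every declared
    code must be either required or additionally allowed.
    """
    declared = set(source_type.split())
    required = set(experiment_entry["required_model_components"])
    optional = set(experiment_entry["additional_allowed_model_components"])
    return required <= declared and declared <= (required | optional)
```

Under this assumed rule, "AOGCM" and "AOGCM AER" would pass for 1pctCO2, while "AGCM" would be rejected because the required "AOGCM" code is missing.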
To make sure the vocabularies used by ES-DOC and by "source_type" are consistent, yes please point us to where the es-doc vocabulary is defined by raising an issue in this thread (not on the CMOR github repo).
thanks,
Karl
Hi Eric,
Regarding the experiment_id CV, this information was obtained from the MIP co-chairs, first by Martin, then updated extensively by me over the last year or so. It is essentially finalized and the experiment_ids are in almost all cases consistent with what is found in the GMD experiment description papers. Charlotte at one point obtained a copy of the excel spread sheet from me, but I’m not sure whether she has altered the information in any way.
The most up-to-date experiment_id information is found here or for a web-based tabulated version, here
We have translated the critical experiment information from the original spreadsheet into the CMIP6_experiment_id.json file in the CMIP6_CVs repo. We now regard this .json file as the “reference” for the most important experiment information. ES-DOCs can import this information as needed. If there are any errors in the CV, please raise an issue on this repo. CMOR and ESGF will also rely on this reference CV for experiment_id. If additional information about the experiments should be included, please let us know immediately. What’s in there now is sufficient for CMOR and ESGF (but note the issue raised about sub_experiments, which we will be addressing shortly; see https://github.com/WCRP-CMIP/CMIP6_CVs/issues/1).
thanks, Karl
Karl
I’d vote for the CMIP5 approach. I think it gives useful information to end users.
Mark
@durack1: I would strongly advise against embedding ES-DOC url's directly into the vocabs, i.e. ES-DOC syncs with the vocabs not the other way round. Any syncing issues will be raised as tickets on this repo.
Dear All, I will sit down with Paul next week in China to try to resolve this (I think we are getting there). I would vote to have the 'source' field as simple as possible. If it is not used by automated tools but just for a quick look at the file (say via a ncdump -h) then we should not worry too much about it. The CMIP5 version:
source = “CCSM2 (2002) atmosphere: CAM2 (cam2_0_brnchT_itea_2, T42L26); ocean: POP (pop2_0_ver_1.4.3, 2x3L15); sea ice: CSIM4; land: CLM2.0”
is too specific and opens the door for mismatch (as pointed out above). Maybe an alternative would be to list just the realms:
source_id = “CCSM2”
source = “CCSM2: cycle 2.1.14 (2002): atmosphere; ocean; sea ice; land”
and the details would be found under the further_info_url
Eric
Hi Eric and all,
"source_id" will definitely be ingested and used by machine (DRS, ESGF, file names, directory names, further_info_url, etc.), but my view is that "source" would not be tracked by the infrastructure but provide human readable information telling us what model (and model components) produced the output. I would think that good practice dictates that most modeling groups record full identifying information about their model (and its components) whenever they save an output file. I would think they would want to carry that provenance information over to the files they write for the CMIP archive. The information we collect as part of the "source_id" can capture the provenance information and then concatenate it together into the "source" attribute. It won't be a requirement for groups to do this (as you say, source = "CCSM2: cycle 2.1.14 (2002)" might be sufficient for some groups), but others might want to include component model information (if that's their usual practice), and we're providing an option for that.
We'll need to make it clear what information is required and what is optional.
thanks, Karl
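The concatenation Karl describes might look something like the sketch below. It assumes the registered information is available as a model label, a vintage year, and an optional mapping of realm to component name. The function name `build_source` and the exact punctuation are assumptions, loosely following the CMIP5-style example quoted earlier, not an agreed format.

```python
def build_source(label, vintage, components=None):
    """Assemble a human-readable "source" string.

    label      -- identifying label for the model (e.g. "CCSM2")
    vintage    -- year the model was first used in a scientific application
    components -- optional mapping of realm -> component model name
    """
    text = f"{label} ({vintage})"
    if components:
        # Component details are optional; append them only when supplied.
        parts = "; ".join(f"{realm}: {name}" for realm, name in components.items())
        text = f"{text} {parts}"
    return text
```

With only a label and vintage this yields the short form; groups that usually record component information could pass the mapping to get the longer, CMIP5-like string.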
@eguil @taylor13 and I have worked through these templates and we're satisfied with the synchronization - closing
Folks on this thread, please open a new issue with more specific information about any tweaks that are still required.
Coordination is needed to avoid asking modelling groups for the same information twice.