HumanBrainProject / openMINDS

openMINDS comprises a set of metadata models for research products in the field of neuroscience.

Revision of ResearchProduct & ResearchProductVersions #17

Closed lzehl closed 3 years ago

lzehl commented 3 years ago

This issue is directed to @apdavison @UlrikeS91 @skoehnen @bweyers & @jagru20 (@apdavison could you please also include Peyman into the discussion? I don't know his GitHub alias...)

Task: Please go through the issue (ask questions if you have them) and provide feedback (agreement or objection) sometime within the next week; March 15 - March 19 (or let me know if you need more time to think this through).

@olinux and I had an in-depth discussion around the relation between ResearchProduct (Dataset, MetaDataModel, Model, Software) and ResearchProductVersion (DatasetVersion, MetaDataModelVersion, ModelVersion, SoftwareVersion) in respect of DOI versioning and the KG Search visualization.

Currently the situation is the following:

@olinux and I now made the following assumption:

For this reason, @olinux and I would like to simplify the schemas in the following way:

What do you think about this suggestion? Would this collide with your use cases or would it actually make them easier to handle? We are looking forward to your thoughts!

apdavison commented 3 years ago

Can the description of the ResearchProduct be modified after the DOI is created? e.g. to add a summary of new features added in more recent versions.

jagru20 commented 3 years ago

Thank you for this approach! While working on the migration scripts I noticed a certain redundancy between the properties you mentioned. From what I can say, your suggestions would make the schema more intuitive and would also fit our use cases very well. This is especially true, as many of the software versions do not provide a description. Therefore, I strongly support these changes.

One minor question though:

and (if provided) replaced and complemented with the "alternative name" and "version innovation", respectively.

I assume, this also applies to the representation of the information in the KG? (As it is right now?)

jagru20 commented 3 years ago

Another thought: I do not know if the digitalIdentifier really needs to be required. Many of the software entities do not have such an identifier. Or do we register a DOI for every instance blanketly? In the past, I had the impression that this was not wanted from the HBP perspective, but this may be a thought for a different issue.

Peyman-N commented 3 years ago

I think the current way of handling the DOI is the most optimal solution but there is another way to handle it too. If I recall correctly we can implement different versions of the same document in the DOI metadata record (I am not sure about this and I couldn't find any useful information in the DOI documentation).

Regarding the DOIs of the ResearchProductVersion and ResearchProduct: I want to know how the DOI of the ResearchProduct is going to behave. Is it going to point to the DOI of the most recent ResearchProductVersion, is it going to show a list of all the ResearchProductVersions, or is it simply going to point to the ResearchProduct object?

Concerning the full name: after the final version of the ResearchProductVersion has been uploaded and the ResearchProduct is finalized, how are we going to update the full name in the ResearchProduct if there was a name change between the ResearchProductVersions? Are we even supposed to update it?

jagru20 commented 3 years ago

I think the current way of handling the DOI is the most optimal solution but there is another way to handle it too. If I recall correctly we can implement different versions of the same document in the DOI metadata record (I am not sure about this and I couldn't find any useful information in the DOI documentation).

I think what you mean is the <relatedIdentifier> tag, specified by the relationType property? see p. 46ff in https://schema.datacite.org/meta/kernel-4.3/doc/DataCite-MetadataKernel_v4.3.pdf
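To make the idea concrete, here is a minimal sketch of how version links could be expressed with related identifiers. It assumes the JSON field names mirror the DataCite element names (`relatedIdentifier`, `relatedIdentifierType`, `relationType`); all DOIs are hypothetical placeholders, and this is not how the KG actually registers DOIs.

```python
import json

# Hypothetical DOIs, for illustration only.
CONCEPT_DOI = "10.1234/example-product"
V1_DOI = "10.1234/example-product.v1"
V2_DOI = "10.1234/example-product.v2"


def related_identifier(doi: str, relation_type: str) -> dict:
    """Build one relatedIdentifier entry in DataCite-style JSON."""
    return {
        "relatedIdentifier": doi,
        "relatedIdentifierType": "DOI",
        "relationType": relation_type,
    }


# Record for version 2: it supersedes v1 and belongs to the concept record.
v2_record = {
    "doi": V2_DOI,
    "relatedIdentifiers": [
        related_identifier(V1_DOI, "IsNewVersionOf"),
        related_identifier(CONCEPT_DOI, "IsVersionOf"),
    ],
}

# Record for the concept: it lists every version attached to it.
concept_record = {
    "doi": CONCEPT_DOI,
    "relatedIdentifiers": [
        related_identifier(V1_DOI, "HasVersion"),
        related_identifier(V2_DOI, "HasVersion"),
    ],
}

print(json.dumps(v2_record, indent=2))
```

With links like these, each version record points to its predecessor and to the concept, while the concept record enumerates its versions.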

UlrikeS91 commented 3 years ago

I’m not really on board with this proposal. For datasets, I cannot come up with a lot of examples, where the principal idea is to regularly create new versions of the same dataset. Correct me if I’m wrong, but usually the procedure is: I come up with a hypothesis, I test the hypothesis with an experiment, I get an answer. Yes, there are cases where the answer won’t be clear. Then you may adjust your experiments and try again. But generally, I doubt that most researchers would start a project thinking they will redo experiments over and over again until all details are perfected, especially if the first collection gives a clear answer already. So, in most cases there won’t be a second version. Why would I create a conceptual shell for a dataset that will never need one? Especially if the conceptual shell will have the same DOI while there is only one version.

The way I see it, in those cases I wouldn’t want to only see the concept description either but rather the description of the only version available (with the exact details). And even more so, this applies with or without plans for new versions. One example of a dataset that is planned to be updated regularly are brain atlas related datasets (e.g. containing the delineations). These are usually updated based on the latest knowledge in the field. So, let’s take the Waxholm atlas of the rat brain (WHS) as example, and imagine v1.01 is the only version so far and we don't allow updates to the general dataset description. This version contains delineations of 76 major brain regions. That’s crucial information, but I shouldn’t write this in the dataset description. Reason is that I already know that e.g. the cortex (which doesn’t have substructures in v1.01) will be split into several more regions. Therefore, 76 won’t be the final number. Then the only place to add it is in the “fullDocumentation” or under “versionInnovation” if the property name is changed (innovation feels wrong when it's a first version – everything is novel there). Alternatively, I could write the description in a way that will give the user the information indirectly, e.g. “The dataset contains delineations of the rat brain starting with 76 major brain regions…”. That will look odd for as long as the second version isn’t published yet. Because nothing has changed about that yet (specifically for this example v1.01 and v2 are around a year apart) and even when it did, the first version will inherit this as description (which will make little sense). I think it would make more sense to have the first version standing entirely on its own (without a dataset, just the datasetVersion) until a second version exists. One could allow changes to the researchProduct description, as @apdavison also raised as a question, but I’m personally against allowing this. 
This would mean that every time the researchProduct description is changed, one would need to ensure that the changes won’t affect previous versions, especially when all researchProductVersions inherit this description. It seems like a lot of maintenance work and would generally not be best practice, I think.

Not allowing changes could also cause other issues (next to the ones described for the WHS example). People could write them vaguely so that it will surely fit any potential new version (worst case also the ones that shouldn’t be versions but rather new researchProducts) or would have to make new researchProducts when the new version, which should in fact be one, doesn’t fit the original description.

More of an FYI: The way the model is built up now, researchProducts do NOT have to have versions or vice versa. If we were to adapt this proposal, we would need to change this accordingly. Otherwise, it would be theoretically allowed to have a researchProductVersion without a fullName or a description.

lzehl commented 3 years ago

@apdavison @jagru20 @Peyman-N @UlrikeS91

Thank you all for responding so quickly and starting the discussion on the topic. I will try to address all of your points and concerns in the following.

Updating full name (title), description and authors after publication/DOI assignment

General overview on DOI versioning

Considering ResearchProducts with only one expected Version vs with multiple Versions

Other concerns

I hope my response resolved some of the concerns? Please let me know your thoughts and continue with the discussion :wink:

apdavison commented 3 years ago

Thanks for the feedback @lzehl. I think these changes would mostly be welcome for models, although on balance I think I would argue to keep the "description" field for ResearchProductVersion but have it optional.

In most cases it would be empty, and when displayed the RPV would show the description from the associated ResearchProduct, but in cases where the ResearchProduct description changes radically when a new version is released I think it is important to preserve the original description, by copying it to the description fields of previous RPVs as needed.

lzehl commented 3 years ago

@apdavison @jagru20 @UlrikeS91 @Peyman-N

If we keep (an optional) "description" for the RPV (thanks for that abbreviation :sweat_smile: ) do we need to keep the "version innovation" in addition? I fear that having both "description" and "version innovation" might lead to only one of them being used.

Originally our idea was that the "version innovation" field can take over the "description" for the RPV (also for a full description if it changed drastically), but I see that this might not be intuitive and maybe not even fully correct in the meaning of the term. Although I would still prefer it.

What do you think?

apdavison commented 3 years ago

I think you would need clear instructions. Ideally "description" would not be edited manually, but filled automatically when changing the description of the parent ResearchProduct.

skoehnen commented 3 years ago

Originally our idea was that the "version innovation" field can take over the "description" for the RPV (also for a full description if it changed drastically), but I see that this might not be intuitive and maybe not even fully correct in the meaning of the term. Although I would still prefer it.

I like that this adds versioning to the description.

Wouldn't it make sense to move full name and description completely to RPV and make at least one RPV mandatory for all Research Products? That way we would not need to think about different types of descriptions (description and versionInnovation) and names. We would then just take the last entry in the chain of versions: if a new version doesn't introduce changes to the description, the description is taken from the last version with a description entry.
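A rough sketch of that fallback rule, assuming the versions of one RP are available as a list ordered from oldest to newest. The `Version` class and `effective_description` function are hypothetical names used only for illustration, not part of any openMINDS tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    identifier: str
    full_name: Optional[str] = None
    description: Optional[str] = None


def effective_description(versions: List[Version], index: int) -> Optional[str]:
    """Return the description for versions[index].

    Falls back to the closest earlier version that has a description.
    `versions` must be ordered oldest first.
    """
    for version in reversed(versions[: index + 1]):
        if version.description:
            return version.description
    return None


chain = [
    Version("v1", full_name="Atlas", description="Initial delineations."),
    Version("v2"),  # no new description: inherits from v1
    Version("v3", description="Adds cortical subdivisions."),
]

print(effective_description(chain, 1))
```

The same loop would work for the full name by reading `full_name` instead of `description`.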

UlrikeS91 commented 3 years ago

Wouldn't it make sense to move full name and description completely to RPV and make at least one RPV mandatory for all Research Products? That way we would not need to think about different types of descriptions (description and versionInnovation) and names. We would then just take the last entry in the chain of versions: if a new version doesn't introduce changes to the description, the description is taken from the last version with a description entry.

I like this idea! We could even flip the original proposal around so that RP has an "alternativeName" field and an "alternativeDescription" field (both optional). If they aren't filled out, the "fullName" and "description" from the latest RPV are inherited instead. Then, the RPV could keep the "versionInnovation" to point to specific novelties of the exact version. This way, the researchers could choose themselves if they want to have a general "description" that fits all versions (which would also suit the RP) and state novelties under "versionInnovation". Or they could write new descriptions for each version that are more detailed and specific to the different versions, have key points under "versionInnovation" and add an "alternativeDescription" to the RP (or not if they don't want one).

Based on @lzehl comments here:

  • [...] @UlrikeS91 therefore I believe the suggested changes would actually reduce also for datasets a lot of redundant work especially for the ones that only will ever have only one single version.
  • As soon as a second version is attached to the ResearchProduct, the full name and description of that ResearchProduct may be changed (to better address both attached versions) and a separate concept DOI is released that will point to the landing page of the ResearchProduct. Version specific details on the description or more specific or first versions of full names can be provided within each Version using the "alternative name" and "version innovation" fields. [...]

This approach would be suitable as well, but less centered around the RP itself. It would give more flexibility while reducing the overall redundancy. But instead of having to adjust the RP with every new version, one can give a more general alternative name and description once it becomes clear where this is going. The RPV should always be described well. I fear that the proposed changes, where both "alternative name" and "version innovation" are optional fields on the RPV, could affect this negatively.

Additionally, as I said and you commented on, RP and RPV didn't require each other in the previous version of the schemas, so that having components (RP+RP (or RPV+RPV)) would be equally ok as having versions (RP+RPV):

  • As @UlrikeS91 pointed out: within the ResearchProduct the hasVersion is currently not required, because a ResearchProduct only must have either a Version attached, or a Component attached (which is a dependency we did not yet have to define via our schemas, but left it to curation). We could change this, though, to a ResearchProduct having to have a Version attached, and optionally can also have a Component attached. This would force the component metadata model (cf. the discussion and decision made in HumanBrainProject/openMINDS_core#163) to a certain structure, but would maybe make the overall differentiation between Versions and Components more clear.

But I think that it would make generally more sense to have the requirement that a RP has to have at least one RPV. I find it strange that an RP can exist without having at least one version. "hasComponents" should stay optional. @lzehl, could you explain again why it made sense to allow either component or version? I know that one needs to be there, but why would it be ok to have a RP, which is more of a conceptual construct, without the details for at least one specific version?

lzehl commented 3 years ago

@skoehnen & @UlrikeS91 making at least one RPV mandatory for a RP is indeed something we could introduce. I did not so far because it would force the structure for components to the same thing: each component has to have a version then. It is not something I'm opposed to, I would even like this strict rule. However, it makes even simple component structures quite complex (Example for following this rule: one collection RP has 3 component RPs, all 4 RPs have to have at least one RPV, the RPV attached to the collection RP has at least 3 components attached, namely the RPVs of the 3 component RPs).

As a reminder for all, concerning the RPVs listed in the RP property "hasVersion": a simple inheritance rule for "description" and "full name" between these RPVs is a bit more difficult because, in principle, we have two types of RPVs that are identified by the cross links between them:
(1) Linkage over the RPV property "isNewVersionOf" identifies the previous, outdated version for an RP.
(2) Linkage over the RPV property "isAlternativeVersionOf" identifies any alternative, but equally usable, version for an RP.
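The difficulty can be illustrated with a small sketch. It assumes the two link types are available as plain dictionaries keyed by hypothetical RPV identifiers; `latest_versions` is an illustrative helper, not an existing KG function.

```python
from typing import Dict, List, Optional, Set

# Hypothetical RPV identifiers mapped to the version each one supersedes
# (the target of "isNewVersionOf"); None marks the first version.
is_new_version_of: Dict[str, Optional[str]] = {
    "v1": None,
    "v2": "v1",
    "v3": "v2",
    "v3-alt": "v2",
}

# "isAlternativeVersionOf" links between equally usable versions.
is_alternative_version_of: Dict[str, Set[str]] = {
    "v3": {"v3-alt"},
    "v3-alt": {"v3"},
}


def latest_versions(new_version_of: Dict[str, Optional[str]]) -> List[str]:
    """Return every version that no other version supersedes."""
    superseded = {prev for prev in new_version_of.values() if prev is not None}
    return sorted(v for v in new_version_of if v not in superseded)


heads = latest_versions(is_new_version_of)

# More than one head means "inherit from the latest version" is ambiguous.
heads_are_alternatives = all(
    b in is_alternative_version_of.get(a, set())
    for a in heads
    for b in heads
    if a != b
)
print(heads, heads_are_alternatives)
```

As soon as alternative versions exist, the chain has more than one end, so a rule like "take the last version" no longer picks a unique RPV.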

The point of having a "description" and "full name" explicitly in the RP is to improve what was done so far (cf. above) and provide an actual (always existing) landing page for the RP. Therefore I want to keep it there. Maybe one thing did not become clear in the beginning of this issue: The RP is the main landing page that will be visualized in the KG Search (also in order to get rid of the display of all the versions in the search as we have them right now). The RPVs will be reachable over that RP or through an explicit version search. What to display on the RP landing page is currently nearly impossible to decide on because we never know what is more correct / representative or completely redundant to display...

@apdavison making the "description" of the RPV automatically filled from the RP at the time point it changes, would mean that only the first RPV of an RP will have a "description" (the original one) and all other RPVs that follow will not have a "description", wouldn't it? (because every later change in the RPV will not be representative for one specific version) I'm not sure if this is helpful.

What would speak against using the "versionInnovation" field for keeping track of first descriptions of the RP? (first RPV either leaves "versionInnovation" blank or gets the first "description" of the RP copied in, if the attachment of a second RPV drastically changes this description in the RP; second, third, etc RPV only lists in "versionInnovation" the version novelties)

An alternative to this: If we want to keep track of the changes, we could within the KG system provide a change-log for the "description" and "full name" of the RP as an additional feature. If necessary, we could even introduce this as a property in the RP pointing to a change-log schema (with "dateOfChange", plus the properties where we want to track the changes).
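A minimal sketch of what such a change-log could look like, assuming a log entry stores "dateOfChange" plus the previous values of the tracked properties. The class and method names are hypothetical and only illustrate the idea; no such schema exists yet.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ChangeLogEntry:
    date_of_change: date
    full_name: Optional[str] = None
    description: Optional[str] = None


@dataclass
class ResearchProduct:
    full_name: str
    description: str
    change_log: List[ChangeLogEntry] = field(default_factory=list)

    def update(self, when: date, full_name: Optional[str] = None,
               description: Optional[str] = None) -> None:
        """Archive the current values, then apply the new ones."""
        self.change_log.append(
            ChangeLogEntry(when, self.full_name, self.description))
        if full_name is not None:
            self.full_name = full_name
        if description is not None:
            self.description = description


rp = ResearchProduct("Example atlas", "Delineations of 76 regions.")
rp.update(date(2021, 3, 20), description="Delineations of the rat brain.")
print(rp.description, len(rp.change_log))
```

With this, earlier descriptions stay retrievable without storing them in any RPV.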

I want to argue strongly for a reduction of possible descriptions of versions to reduce redundancy, but also to avoid the misuse of "description" being used instead of "versionInnovation" and vice versa. That could also mean to kill the term "versionInnovation" in the RPV and leave the "description" as optional field if the broader term is preferred. The general description and full name have to remain in the RP, because that is what is going to appear first in the KG Search. And I do not like to just point to the latest version for two reasons: (1) the latest version is not representative for all other versions in the RP, and (2) in case it has an alternative version it is unclear again which should be preferred.

lzehl commented 3 years ago

btw: thanks everyone for participating in this discussion! I really appreciate all your input and arguments in order to come to a good solution that will serve all (including the KG Search implementation). Please continue :smiley: :+1:

lzehl commented 3 years ago

@apdavison @UlrikeS91 @Peyman-N @jagru20 @skoehnen @olinux

In order to proceed with this issue I'd like to make the following proposal based on the discussion above with these rules/structural changes (I've numbered them to make it easier discussing individual points):
1) The RP has to have at least one RPV, meaning the property "hasVersion" is required (count 1-n).
2) The RP can have one or multiple components (link to another RP), meaning the property "hasComponent" is optional (count 1-n, when used). If an RP has a component it is interpreted as a collection. This collection is (also) versioned, meaning the attached RPVs under "hasVersion" are interpreted as collection versions.
3) The RPV can have one or multiple components (link to another RPV), meaning the property "hasComponent" is optional (count 1-n, when used). If an RPV has a component it is interpreted as a collection version.
4) The RP receives its own DOI only when a second RPV is attached (in "hasVersion"). When only one RPV is attached it receives the DOI of the RPV. The "alternativeName" and "description" in the single attached RPV should be blank or identical to the "fullName" and "description" of the RP (see also points 5 and 6).
5) The RP will require a "fullName" and "description" property. This fullName and description should be representative for all attached RPVs. Changes (if unavoidable) can always be made (DataCite keeps track of such changes). Future (maybe): we keep track of such changes within the KG as well.
6) The RPV optionally can have an "alternativeName" and a "description" property. The "description" should contain a description of the novelties of the version compared to the previous one or in extreme cases hold a full specific description of the version. General rule for curation of "alternativeName" and "description": avoid redundancy (leave blank if not needed) and use only to capture changes.
7) The RPV property "versionInnovation" will be deleted. Version specifications/novelties can be described in "description".
8) The RPV property "fullName" will be deleted. A version specific name can be stated in "alternativeName".
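To make points 1, 5, 7 and 8 easier to discuss, here is a small validation sketch. It assumes RP and RPV instances are plain dictionaries keyed by property name; `validate_rp` and `validate_rpv` are hypothetical helpers, not part of the openMINDS schemas or the KG.

```python
from typing import Dict, List


def validate_rp(rp: Dict) -> List[str]:
    """Check a ResearchProduct dict against points 1 and 5."""
    errors = []
    for prop in ("fullName", "description"):
        if not rp.get(prop):
            errors.append(f"RP is missing required property '{prop}'")
    if len(rp.get("hasVersion", [])) < 1:
        errors.append("RP needs at least one entry in 'hasVersion'")
    return errors


def validate_rpv(rpv: Dict) -> List[str]:
    """Check a ResearchProductVersion dict against points 7 and 8."""
    errors = []
    for prop in ("fullName", "versionInnovation"):
        if prop in rpv:
            errors.append(f"RPV must not use removed property '{prop}'")
    return errors


good_rp = {
    "fullName": "Example dataset",
    "description": "A general description that fits all versions.",
    "hasVersion": ["rpv-1"],
}
bad_rp = {"fullName": "Example dataset", "hasVersion": []}

print(validate_rp(good_rp))
print(validate_rp(bad_rp))
print(validate_rpv({"alternativeName": "Example dataset (pilot)"}))
```

"alternativeName" and "description" on the RPV are optional (point 6), so they are deliberately not checked here.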

Would that be acceptable for everyone (meaning does it improve your use cases or does it complicate them)?
@olinux are the technical concerns still addressed in that proposal? It would require the design of a landing page for the RPs.

For MODELS: @apdavison did move some properties to the RP, because they would be the same across all attached RPVs.
For DATASETS & SOFTWARE: @UlrikeS91 & @jagru20 do you think we should transfer some properties from the RPV to the RPs in order to avoid redundancy? (note that this change could also be done later when we see this redundancy in the registered instances)

olinux commented 3 years ago

Here are my comments: It's highly important that it is clear to both the producers and the consumers of metadata that defining a description on the RPV shall be interpreted to fully replace the description of its RP (if there are pieces which should be taken from the RP description, they have to be copied over). To make this even more clear, I would suggest renaming "description" to something like "versionSpecificDescription" for the RPV so we can clearly state in its documentation that this property should be prioritized over the "description" property in related RPs. For "alternativeName": schema.org calls this "https://schema.org/alternateName", but this doesn't really reflect the "power" of this property either (since it also has an implicit override of the name of the RP). I could imagine that a renaming to "versionSpecificName" or similar would make this fact more transparent too.

Other than that, I think it would work well.
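The override semantics described above could be sketched like this. The property names "versionSpecificName" and "versionSpecificDescription" are only the renaming suggested here, and `resolve_display` is a hypothetical helper; RP and RPV are assumed to be plain dictionaries.

```python
from typing import Dict, Optional


def resolve_display(rp: Dict[str, str],
                    rpv: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return the name and description to display for one RPV.

    A version-specific value, when present, fully replaces the RP value;
    nothing is merged or appended.
    """
    return {
        "name": rpv.get("versionSpecificName") or rp["fullName"],
        "description": (rpv.get("versionSpecificDescription")
                        or rp["description"]),
    }


rp = {"fullName": "Example model", "description": "General description."}
rpv_plain = {}
rpv_override = {"versionSpecificDescription": "Rewritten for version 2."}

print(resolve_display(rp, rpv_plain))
print(resolve_display(rp, rpv_override))
```

An RPV that leaves both properties empty simply shows the RP values.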

lzehl commented 3 years ago

@olinux note that in the proposal above the RP description cannot be replaced by the RPV description if the latter only holds a description of the version innovation. It should rather be handled in the following way for DOI and display: RP description + RPV description. That again will look a bit weird if the RPV description can stand on its own.

mh... to get this clean would mean that we cannot delete the RPV "versionInnovation" property and need to make sure that this is used instead of the RPV "description" property for capturing descriptions of version novelties. In addition the RPV "description" should be kept empty unless drastic changes occurred that need a full-blown description.

Correct?

Renaming the RPV properties "versionSpecificDescription" and "versionSpecificName" could still be considered although that could also be dealt with in an instruction I think (to avoid introducing new property names that basically mean the same thing... which also means that we could stick with "fullName" in principle)

UlrikeS91 commented 3 years ago

Agreed without further comments: 1., 4. - 5., 8.

  2. The RP can have one or multiple components (link to another RP), meaning the property "hasComponent" is optional (count 1-n, when used). If an RP has a component it is interpreted as a collection. This collection is (also) versioned, meaning the attached RPVs under "hasVersion" are interpreted as collection versions.
  3. The RPV can have one or multiple components (link to another RPV), meaning the property "hasComponent" is optional (count 1-n, when used). If an RPV has a component it is interpreted as a collection version.

Point 2 says:

    RP1 --hasComponent--> RP2     == collection
     | req                 | req
    RPV1 ----------------- RPV2   == collection

Point 3 says:

    RP1                   RP2     not a collection?
     | req                 | req
    RPV1 --hasComponent--> RPV2   == collection

Question: What will be the relationship of the RPs for point 3? They won't be a collection then? Given your first point, RPVs need to have an RP now.

  6. The RPV optionally can have an "alternativeName" and a "description" property. The "description" should contain a description of the novelties of the version compared to the previous one or in extreme cases hold a full specific description of the version. General rule for curation of "alternativeName" and "description": avoid redundancy (leave blank if not needed) and use only to capture changes.
  7. The RPV property "versionInnovation" will be deleted. Version specifications/novelties can be described in "description".

Based on @olinux comment:

It's highly important that it is clear for both, the producers and the consumers of metadata that defining a description on the RPV shall be interpreted to fully replace the description of its RP (if there are pieces which should be taken from the RP description, they have to be copied over).

There might be a discrepancy here. This "alternativeDescription" for the RPVs makes sense. It would also make sense to me that this alternative won't extend the original one from the RP itself (as @olinux commented). However, the point of a new version is that something changed, justifying the creation of a new version. And to me, these changes should be described. Preferably in a few sentences and as visibly as possible (meaning displayed on the KG entry and not only in e.g. the data descriptor or other full documentation). This would mean that I would use the "alternativeDescription" every time to describe the novelties for that new version. Then I would need to copy over the RP description and add a sentence to describe the changes/novelties. I think this was not the point of this property, since this would cause the redundancy that @lzehl wanted to avoid under point 6.

Maybe it would make more sense to keep "versionInnovation" then? And have "alternativeDescription" (or whatever it will be called), in fact, as a replacement for the RP description for this RPV.

For DATASETS & SOFTWARE: @UlrikeS91 & @jagru20 do you think we should transfer some properties from the RPV to the RPs in order to avoid redundancy? (note that this change could also be done later when we see this redundancy in the registered instances)

datasetVersion properties:

*In many cases, the modalities will be the same for all versions. What we need to figure out is whether a change of the modality list would be too big a change to be registered as a new version.

Example 1:
Version 1: I did ephys recordings on a head-fixed mouse (region A) while stimulating region B. (modality: electrophysiology (and probably others, but that is not the point))
Version "2": Same ephys recordings, but now with mice performing a task (head-fixed or not). (modality: electrophysiology + behaviouralApproach)

Example 2:
Version 1: I did behavioral experiments on humans and only observed what they did. (modality: behaviouralApproach)
Version "2": Now I had them do the same task, but in a scanner. (modality: behaviouralApproach, neuroimaging)

I guess my point is where do we draw the line? Would we allow the version "2" to be a new dataset version or would this be too much? Maybe the examples are bad, but it might be possible to have a case that should be ok.

lzehl commented 3 years ago

This comment is to clarify @UlrikeS91 question about point 2 and 3, because I did not write them well. Here is a correction / explanation of what I meant:

for 2) The RP can have one or multiple components (link to another RP), meaning the property "hasComponent" is optional (count 1-n, when used). If RP1 has a component (->RP2), RP1 is interpreted as a collection. According to point 1, RP1 also has to have at least one version (e.g. RPV1.1). All RPVs of RP1 are interpreted as collection versions.

for 3) The RPV can have one or multiple components (link to another RPV), meaning the property "hasComponent" is optional (count 1-n, when used). If RPV1.1 has a component (->RPV2.1), RPV1.1 has to be a collection version and its parent RP1 is a collection with at least one component (RP2) that has at least one version (RPV2.1)

As help here again the full metadata model for components: https://user-images.githubusercontent.com/6161552/107954727-1f438780-6f9d-11eb-88ed-fb888ddf8ebc.png
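The rule in 3) can be phrased as a consistency check over the graph. The sketch below assumes the links are available as dictionaries of hypothetical identifiers; `parent_of` and `inconsistent_links` are illustrative names, not existing KG functionality.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical instances: RP -> its versions, and component links on both levels.
has_version: Dict[str, Set[str]] = {"RP1": {"RPV1.1"}, "RP2": {"RPV2.1"}}
rp_components: Dict[str, Set[str]] = {"RP1": {"RP2"}}
rpv_components: Dict[str, Set[str]] = {"RPV1.1": {"RPV2.1"}}


def parent_of(rpv: str, versions_by_rp: Dict[str, Set[str]]) -> str:
    """Return the RP that lists `rpv` under hasVersion."""
    for rp, versions in versions_by_rp.items():
        if rpv in versions:
            return rp
    raise KeyError(f"{rpv} is not attached to any RP")


def inconsistent_links(
    versions_by_rp: Dict[str, Set[str]],
    components_by_rp: Dict[str, Set[str]],
    components_by_rpv: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return RPV component links whose parent RPs are not linked as well."""
    problems = []
    for rpv in sorted(components_by_rpv):
        parent = parent_of(rpv, versions_by_rp)
        for component in sorted(components_by_rpv[rpv]):
            component_parent = parent_of(component, versions_by_rp)
            if component_parent not in components_by_rp.get(parent, set()):
                problems.append((rpv, component))
    return problems


print(inconsistent_links(has_version, rp_components, rpv_components))
```

An empty result means every RPV-level component link is mirrored by a component link between the parent RPs.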

UlrikeS91 commented 3 years ago

Yes, great! That's what I hoped :) It wasn't 100% clear from how you wrote it first. Thanks for explaining it again in more detail :)

jagru20 commented 3 years ago

Hi all, I just discussed with @bweyers and we came to the following points/questions:

for 3) When implementing this, would the hasComponent-link between the RPs be done automatically after a RPV has a component added? This would save a lot of unnecessary work for the curators.

for 4) Registering the concept DOI later seems problematic, as it is not clear what you cite when citing the DOI of the first version. Two DOIs for an RP with just one RPV seem unnecessary when assuming that there will never be another version. But we cannot ensure there will never be a successor version, so the following problem arises: If you want to cite the RP (meaning the concept DOI) in a publication and use the first DOI for it, your citation will become faulty at the point in time where a second RPV is added to the RP, because it then cites something different than you intended. I think this somehow undermines the idea of the DOI. I also think that this applies not just to software, but becomes more obvious there than in datasets. We strongly prefer two separate DOIs for RP and RPV from the start.

What about adding an option to mark a RP as single-version to ensure that there will not be a second RPV in that RP? That way you could maybe save one DOI if you really want to.

lzehl commented 3 years ago

@jagru20 & @bweyers thanks for that valuable feedback.

for 3) automatically creating the graph links for the RP/RPV component model might be something we can set up in the KG. I agree that supportive setups like this will facilitate the work of a curator. But as far as I know these setups require a validation of an expected graph structure which is not trivial to do and may not be implemented right now. @olinux what do you think?

for 4) we discussed this issue as well. In principle, we neglected that issue for the following reason: when a user at a certain point in time used data from an RPV and that RPV was the only version of an RP at that time, the DOI will lead to the data that the user actually did use even if he/she took it from the RP. If later in time the RP actually received more RPVs, the user from before could not have planned those additions and therefore he/she could not have meant to reference the RP in that sense. Also if there are two DOIs from the start there is also the risk the other way around: meaning a user uses the DOI of the RP but actually should have used the DOI from the RPV, or gets totally confused because there are two DOIs for the same thing. Nonetheless, I think that part could be improved. See suggestion below.

proposed change A for 4) The RP receives its own DOI only when a second RPV is attached (in "hasVersion"). When only one RPV is attached the RP in principle does not have any DOI (meaning the RPV DOI is displayed on the RP, but with a disclaimer stating something like this: "This RP only contains one RPV. The provided DOI is representing the RPV and not the RP. The RP receives its own DOI, referencing all versions, only when a second version is attached."). The "alternativeName" and "description" in the single attached RPV should be blank or identical to the "fullName" and "description" of the RP (see also points 5 and 6). @olinux would that work?

proposed change B for 4) The RP receives its own DOI, also when there is only one RPV attached. The DOIs on the RP and on the RPV both receive an information disclaimer: for RP "The DOI of an RP can be used to reference all current and future attached RPVs" and for RPV "The DOI of an RPV can only be used to reference this exact RPV."
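As a rough illustration of how the two options differ for the RP landing page (a sketch only; the function names, disclaimer wording and DOIs are hypothetical):

```python
from typing import List, Optional, Tuple

DISCLAIMER_A = "This RP contains only one RPV; the DOI shown belongs to that RPV."
DISCLAIMER_B = "The DOI of an RP references all current and future attached RPVs."


def rp_doi_option_a(rp_doi: Optional[str],
                    rpv_dois: List[str]) -> Tuple[str, Optional[str]]:
    """Option A: the RP has no DOI of its own until a second RPV exists."""
    if not rpv_dois:
        raise ValueError("an RP needs at least one RPV")
    if len(rpv_dois) == 1:
        # Show the DOI of the single version, flagged with a disclaimer.
        return rpv_dois[0], DISCLAIMER_A
    if rp_doi is None:
        raise ValueError("an RP with several RPVs needs its own DOI")
    return rp_doi, None


def rp_doi_option_b(rp_doi: str) -> Tuple[str, str]:
    """Option B: the RP always has its own DOI, shown with a note."""
    return rp_doi, DISCLAIMER_B


print(rp_doi_option_a(None, ["10.1234/example.v1"]))
print(rp_doi_option_b("10.1234/example"))
```

Under option A the DOI shown on the RP changes once a second RPV is attached; under option B it stays the same from the start.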

@ ALL: which option would you prefer? Please overlook the badly formulated disclaimers :sweat_smile:

jagru20 commented 3 years ago

In my opinion, and I hope speaking also for @bweyers here, I would prefer option B.

bweyers commented 3 years ago

Yes you do @jagru20

lzehl commented 3 years ago

Dear all, based on our discussion in the developer meeting we now came to the following conclusion:

For the "shared" properties between RP and RPV I would like to suggest this approach/workflow (defining inheritance rules for RP towards RPVs):

@olinux and also all others: would that fit? does this set clear rules for the inheritance between RP and RPV?

@UlrikeS91 did we agree now to remove "developer" from the Dataset/DatasetVersion? @apdavison @jagru20 @bweyers do you need "authors" at all for the cards or would "developers" be sufficient (with them acting the same way as "authors" in case a DOI is assigned)?

UlrikeS91 commented 3 years ago

This issue seems solved (as much as possible at least). @lzehl should we close this for now?

lzehl commented 3 years ago

Yes. I think we can close this one. The Collection issue/discussion is covered elsewhere.