information-artifact-ontology / ontology-metadata

OBO Metadata Ontology
Creative Commons Zero v1.0 Universal

New annotation property: contributor affiliation #127

Open cthoyt opened 1 year ago

cthoyt commented 1 year ago

I want to have an appropriate predicate to use as an annotation on a dc:contributor annotation assertion that links the affiliation(s) of the contributor at the time of the contribution. Note that affiliations can change over time, but this should capture the affiliations of a contributor at the point in time that they made a contribution.

I demonstrated this in https://github.com/oborel/obo-relations/pull/708, which boils down to the following image:

Note that I used rdfs:seeAlso here - I think it makes sense to have an IAO predicate here, labeled "contributor affiliation". It was also suggested to use http://www.w3.org/ns/org#memberOf as a predicate.

matentzn commented 1 year ago

Wrong repo, moving this.

matentzn commented 1 year ago

My sense is that we should not introduce a new property, and should instead add a dc:contributor annotation with the organisation identifier in parallel to the human creator. As affiliations between humans and orgs tend to fluctuate a lot, I also don't think the relationship between the person contributor and the respective affiliation needs to be tracked. So basically:

<:Anemia> dc:creator <orcid:123>
<:Anemia> dc:contributor <ror:999>

Where the <:Anemia> dc:contributor <ror:999> is added as part of an SOP requirement of the respective ontology repo. Did you think of this somehow differently?

graybeal commented 1 year ago

"affiliations between humans and orgs tend to fluctuate a lot" is a reason to include affiliation, not exclude it. A comment I make with affiliation A may be contradicted by a comment when I have affiliation B. The affiliations are very useful in that case.

Having said that, I know that thinking leads to an infinite regression of possibly relevant metadata. To me this one seems more relevant than most.

matentzn commented 1 year ago

@graybeal my proposal covers

  1. contributions done by individuals
  2. contributions done by organisations

but excludes the fact that the contribution done by the organisation was due to the contribution done by the individual.

Am I understanding you correctly that you would want to track which organisation an individual was part of the moment they made the contribution?

cthoyt commented 1 year ago

There is no such thing as a contribution by an organization. A human being must at some point interact with the data. They may be doing this in their capacity as a member of an organization, but there is no case ever where e.g. "Monarch Initiative" made a contribution to an ontology.

Therefore, to reiterate what I tried to write before and show in the screenshot at the top of the issue: I think we should never do <:Anemia> dc:contributor <ror:999>. We should exclusively write <:Anemia> dc:contributor <orcid:123>, then have an annotation on this triple that the organization under which orcid:123 was acting is ror:999. If it's the case that multiple people were behind a contribution, then there's no problem in writing multiple contributor annotations.

matentzn commented 1 year ago

There is no such thing as a contribution by an organization.

I am not unsympathetic to what you are saying! But I think that sometimes groups of people sit around a table to debate a definition, and just want to say that "their group did it" rather than the 10 individuals sitting around the table - this is certainly the case for many GO-related efforts. In any case, we don't need to decide whether or not <:Anemia> dc:contributor <ror:999> should be allowed - that is policing to a level that does not help us move forward. What we should think about is how to represent:

Nico added this synonym, and while doing so, he was a member of Monarch.

without going "too deep". Perhaps this is simply not possible, and we need to go with policies that encourage level 1 annotations like:

Nico contributed to this term, and while doing so, he was a member of Monarch.

cthoyt commented 1 year ago

okay good point, we just need this issue to be about how to support

Nico contributed to this term, and while doing so, he was a member of Monarch.

and not necessarily discuss the other possibility further

matentzn commented 1 year ago

Ok fair enough @cthoyt probably you are right.

@jonquet Do you have any suggestion for a suitable property that expresses the "was affiliated with" relation?

jonquet commented 1 year ago

For Organization the standard is the W3C Organization Ontology (which itself relies on FOAF), so the property would be: https://www.w3.org/TR/vocab-org/#org:memberOf and there is a reverse one too.

I would really recommend not using rdfs:seeAlso, which should stay a fallback property, avoided as much as possible if a more precise, still standard one can be found...

matentzn commented 1 year ago

Cool, thank you @jonquet. Just to be clear there was a small typo in your comment, @jonquet you are suggesting http://www.w3.org/ns/org#memberOf.

I am ok with that. We can try to get consensus here just to do due diligence, but I think I would like to push forward implementing your suggestion asap @cthoyt.

Ok, everyone else: I know most of you won't bother - and there won't be any requirement whatsoever to document organisational membership, nor will there be any restrictions on the attribution pattern.

@cthoyt can I call a vote on this:


To express the organisational affiliation under which a person has made a contribution, we suggest to:

  1. Capture the attribution on person level with dcterms:contributor or dcterms:creator: :A dcterms:contributor orcid:123.
  2. Capture the organisational affiliation as an axiom annotation:
    :A dcterms:contributor orcid:123.
    [] a owl:Axiom ;
    owl:annotatedSource :A;
    owl:annotatedTarget orcid:123;
    owl:annotatedProperty dcterms:contributor;
    org:memberOf ror:999

jonquet commented 1 year ago

Just to be clear there was a small typo in your comment, @jonquet you are suggesting http://www.w3.org/ns/org#memberOf.

Yes sorry, in my hurry I used the URL from the browser rather than the URI ;)

cmungall commented 1 year ago

Keep it simple for the 99% case: a simple triple that connects a term or an axiom to an individual or an organization/WG/project. Avoid reification unless you want to talk about an axiom.

There is no such thing as a contribution by an organization

this is permitted by dc:

dct:contributor - The guidelines for using names of persons or organizations as creators apply to contributors. .. and on creator: Examples of a Creator include a person, an organization, or a service

I don't really see the problem, but if we really want to model things at a more granular level, then we should do it properly using accepted standards, i.e. PROV.

Ad-hoc axiom annotation using properties in non-standard ways is the worst of all worlds. Awkward to implement, no one will use it (without new Protege code), awkward to query, non-standard interpretation.

:A dcterms:contributor orcid:123.
[] a owl:Axiom ;
owl:annotatedSource :A;
owl:annotatedTarget orcid:123;
owl:annotatedProperty dcterms:contributor;
org:memberOf ror:999

this says the axiom is a member of the organization.
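
To make that reading concrete, here is the same pattern written out as complete Turtle. This is only an illustrative sketch: the `ex:` namespace and the `orcid:`/`ror:` prefix expansions are assumptions added for the example, and `123` and `999` are the placeholder identifiers used earlier in this thread.

```turtle
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix org:     <http://www.w3.org/ns/org#> .
@prefix orcid:   <https://orcid.org/> .
@prefix ror:     <https://ror.org/> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace for the term

# The plain attribution triple.
ex:A dcterms:contributor orcid:123 .

# The reified axiom carrying the annotation.
_:ax a owl:Axiom ;
    owl:annotatedSource   ex:A ;
    owl:annotatedProperty dcterms:contributor ;
    owl:annotatedTarget   orcid:123 ;
    org:memberOf          ror:999 .   # the subject of this triple is _:ax, not orcid:123

# What was presumably intended, stated directly. Written this way it is no
# longer tied to the specific contribution, so it is left commented out:
# orcid:123 org:memberOf ror:999 .
```

Read as plain RDF, the only subject `org:memberOf` has here is the blank node `_:ax`, so a generic consumer with no knowledge of the intended convention would see the axiom, not the person, as the member.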

Also note that if we want to attribute an axiom then we are in the realm of 2nd order axiom annotation. Are we sure we want to go there?

In contrast PROV provides a simple, clear, standard way of providing the provenance at whatever level of granularity is required, allowing for multiple scenarios ranging from single editors through editors acting as part of a working group while having affiliation at a host organization and acting on behalf of a broader funded consortium, while following design patterns P and Q, being helped by chatGPT, etc.

Yes, no one will manually author a PROV graph, but we can imagine a Protege plugin that tracks the user's current activity and provides relevant metadata. And I think a plugin would be required for the axiom annotation proposal.

jonquet commented 1 year ago

Well... to be honest, @cmungall expressed exactly what I was thinking and was preparing to write offline. I am not comfortable with this idea of adding things to the contributor like that, especially because DCT is not very formal... I would also have said to use PROV to capture the provenance info. I would suggest the same to:

matentzn commented 1 year ago

@jonquet that would not be enough. You would have to model the editing process as a process:Activity as well; in any case, none of this is an option. Neither do I think we will write a Protege plugin, nor do I think it's worth adding 6 PROV assertions for every change to a term. The whole reason for this suggestion was to make it possible to formalise the way we model organisational contributions. And it seems that @cmungall you are against retaining the link between contributors and their organisation at the time of the contribution. I can see what you are saying here with memberOf being used on the assertion, but I hope you know that the intention was for this property to say something about the contributor. We use this pattern for synonym types (layperson synonyms, etc.), subsets, and probably many others. But I hear you that it is not entirely satisfactory.

So back to the drawing board. How should we encourage people to represent institutional contributions? Just add a second triple saying that LBNL was a contributor?

graybeal commented 1 year ago

Edit: the following is wrong-headed, I lost context (wrong assumption that relationships won't change across the entire set of information)

could add header triples (as I think of them, following SKOSPlay, but I guess they are header statements) for each contributor or creator, indicating what organization they are representing. That only needs to be stated once per contributor, not in every mapping statement, and so can be declared up front for the entire set.

matentzn commented 1 year ago

@graybeal unfortunately this won't address our use case: giving precise attribution to organisations when organisational affiliations are prone to change.

graybeal commented 1 year ago

I'm sorry, but the use case of the first entry in this ticket says "predicate to use as an annotation on a dc:contributor annotation assertion that links the affiliation(s) of the contributor". I parse that as you have a contributor (it is a person), and you want to specify the affiliation of that contributor. So it exactly addresses that requirement (which I reiterated in my own comment near the top). The title also parses that way. And my solution addresses that explicitly, I think.

If you want to be able to specify an institution as the contributor, I thought that was directly handled (that the contributor can be either an individual or an institution). I'm used to dealing with DataCite patterns and understood that was our intended pattern, to accept either ROR or ORCID. If not, that's unfortunate IMHO.

matentzn commented 1 year ago

I parse that as you have a contributor (it is a person), and you want to specify the affiliation of that contributor.

Yeah we should have made this clearer. What it should have said:

you have a contributor (it is a person), and you want to specify the affiliation of that contributor while they were making the contribution (tomorrow the affiliation may change, but the affiliation during the act of contributing does not).

We should probably open a new issue and start from scratch, there is already too much confusion in here which I caused!

graybeal commented 1 year ago

Thank you for the clarification. What you said is by my understanding the same as what I said. Let me see what I am missing.

I am creating an SSSOM file. Everything I say in that SSSOM file is "as of now". Every annotation is as of now. So of course the affiliation may change in the future, we are merely capturing current knowledge.

If you are considering that someone may be consolidating a bunch of mappings, and making historical statements about those mappings, I think that is a separate scope entirely. It could be accommodated by enabling statements about each mapping source (or embedding time information within each mapping source). At that point it is the mapping source item that can have the historical information (when was this mapping performed? who were the principal contributors? who paid for it? etc.), and trying to allocate all that information to every mapping will create enormous information bloat.

Perhaps a general observation is that a way to make some arbitrary RDF statements in an SSSOM document might not be a terrible thing.

matentzn commented 1 year ago

Oops, sorry. This issue here is not about SSSOM - it is about OBO ontology metadata! Of course this whole question will apply to SSSOM as well - it's pretty much the same discussion. For SSSOM we will certainly repeat this discussion, and when we do, I will ask you how I can have 2 mappings in the same mapping set, one I made while I was at Manchester University, and another while I was at Stanford! The use case being that Manchester wants to know how many mappings it has financed into being :P

graybeal commented 1 year ago

ooh, my bad, lost my mind! shall we collectively delete or migrate the thread from here?

matentzn commented 1 year ago

I will close the issue and open a new one once I have had a chance to understand Chris's comment above, which I am not following.

cmungall commented 1 year ago

I'll answer in thread just to try and clarify, but happy to contribute to a new thread

if something boils down to 3 choices

  1. simple and not granular enough for 1% of cases (i.e simple triples)
  2. harder, non-standard, and higher granularity than 1 but not enough for 0.1% (i.e non-standard nth-order reification)
  3. harder, standard, and granularity at arbitrary levels (i.e prov)

then {1,3} >> 2
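
For reference, option 1 might be as small as the following Turtle. This is a sketch only, reusing the `:Anemia`, `orcid:123` and `ror:999` placeholders from earlier in the thread; the `ex:` namespace and the prefix expansions are assumptions made for the example.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix orcid:   <https://orcid.org/> .
@prefix ror:     <https://ror.org/> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace for the term

# Two independent triples: one for the person, one for the organisation.
# Nothing here records that the person acted on behalf of the organisation.
ex:Anemia dcterms:contributor orcid:123 ,
                              ror:999 .
```

The trade-off is the one discussed above: this is trivial to author and query, but the link between the individual and the affiliation at the time of the contribution is not captured.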

On Fri, Apr 7, 2023 at 10:32 AM Nico Matentzoglu @.***> wrote:

I will close the issue and open a new one once I had a chance to understand Chris comment above which I am not following.

matentzn commented 1 year ago

Ok, in other words, are you saying that it is not worth capturing the fact that a contributor held a specific affiliation while making a contribution? You know that 3 is not a serious option right now.

This is totally fine by me as well. From your perspective we just add an SOP where needed to document a separate triple for the organisation (probably using dcterms:contributor again), right?

cmungall commented 1 year ago

I hold that 3 is no more impractical than 2, but I realize I may need to amass more evidence for my case

On Fri, Apr 7, 2023 at 11:58 AM Nico Matentzoglu @.***> wrote:

Ok, in other words, are you saying that it is not worth capturing the fact that a contributor held a specific affiliation while making a contribution? You know that 3 is not a serious option right now.

This is totally fine by me as well. From your perspective we just add an SOP where needed to document a separate triple for the organisation (probably using dcterms:contributor again), right?

matentzn commented 1 year ago

It's a bit exaggerated to lump 1st order reification and nth-order reification together, but ok. :D No one here is asking to solve the "added a synonym and while doing so was a member of Monarch" use case anymore - we have dismissed this. It's only about 1st order: I made a contribution while being a member of Monarch. A reasonable PROV graph has at least twice the complexity of 1st order reification (because we need to represent the activity that generated the Entity), and I am not even sure how the two relate for the use case. I guess most people who have not yet muted this thread will wonder what a PROV-based solution would really look like in RDF.
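
As a rough illustration of what a PROV-based encoding could look like in RDF, here is a minimal Turtle sketch. It is not something proposed in this thread: the `ex:` namespace, the `ex:edit-1` activity identifier and the prefix expansions for `orcid:` and `ror:` are all hypothetical, and treating the term itself as the generated entity is a simplifying assumption. Only PROV-O terms from the W3C vocabulary are used.

```turtle
@prefix prov:  <http://www.w3.org/ns/prov#> .
@prefix orcid: <https://orcid.org/> .
@prefix ror:   <https://ror.org/> .
@prefix ex:    <http://example.org/> .   # hypothetical namespace

# The term (or a version of it) is the entity that was contributed to.
ex:Anemia a prov:Entity ;
    prov:wasGeneratedBy  ex:edit-1 ;
    prov:wasAttributedTo orcid:123 .

# The editing activity that produced it (hypothetical identifier).
ex:edit-1 a prov:Activity ;
    prov:wasAssociatedWith orcid:123 .

# The person acted on behalf of the organisation, scoped to this one activity,
# so a later change of affiliation does not alter this statement.
orcid:123 a prov:Person ;
    prov:qualifiedDelegation [
        a prov:Delegation ;
        prov:agent       ror:999 ;
        prov:hadActivity ex:edit-1
    ] .

ror:999 a prov:Organization .
```

Under these assumptions the affiliation is attached to the person and the activity rather than to an axiom node, at the cost of an extra activity resource and a qualified delegation per contribution.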