data2health / contributor-role-ontology

This ontology provides contribution roles for use in crediting persons or organizations.

Should contribution roles be object properties? #14

Open mellybelly opened 5 years ago

mellybelly commented 5 years ago

If we are to use them in triples of the form person <-contribution role-> artifact, we will need to figure out the best practice for doing so.

marijane commented 5 years ago

In the original OpenRIF CRO, we had a Contributorship class which connected people with their contributor roles via the BFO bearer_of and inheres_in relationships and the VIVO relates and related_by relationships. We still have this Contributorship class, but it looks like its parent class has been created as a native CRO class, rather than importing vivo:Relationship.

This seems like a candidate for a potential refactor. I've never been a fan of vivo:Relationship and its properties. First, it is time-bound, which means it's an occurrent, not a continuant, and is placed in the wrong part of BFO. Second, "relates" and "related_by" are not semantically meaningful properties -- they're essentially the same as a Related Term (RT) relationship in a thesaurus.

I think it also requires thinking about what contexts we want people to use the CRO in, and what ontologies they're already using, if any. This gets back to my desire to have multiple versions of the CRO designed to work with different ontologies and vocabularies that contain a Role concept. If we want CRO to be used with ontologies that use BFO, I think we'd want to define at least one process for roles to inhere in. If we want CRO to be used with schema.org structured data, we might need to take a different approach to work with schema:Role (and we might also want a different set of annotations for the ontology itself in those situations, so schema.org users wouldn't need to worry about BFO/IAO). If there are contexts people might use the CRO in that do not involve a Role concept, that might call for implementing them as object properties. I can visualize what any of these things might look like, but I am still figuring out how to get there using the ODK.
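For the schema.org case, here is a minimal sketch of how a contributor role might sit between a work and a person using schema:Role, written as a Python dict that serializes to JSON-LD. The role IRI, person, and work are placeholders rather than real identifiers, and choosing `contributor` as the property is an assumption. Repeating the property inside the Role node follows the usage pattern schema.org documents for Role.

```python
import json

# Hypothetical identifier, for illustration only (not a real CRO term IRI).
CRO_ROLE_IRI = "http://example.org/cro/ExampleRole"

work = {
    "@context": "https://schema.org",
    "@type": "CreativeWork",
    "name": "Example dataset",
    # The Role node sits between the work and the person; the
    # "contributor" property is repeated inside the Role.
    "contributor": {
        "@type": "Role",
        "roleName": CRO_ROLE_IRI,
        "contributor": {"@type": "Person", "name": "Example Person"},
    },
}


def contributors_with_roles(doc):
    """Yield (person name, role) pairs, unwrapping schema:Role nodes."""
    entries = doc.get("contributor", [])
    if isinstance(entries, dict):
        entries = [entries]
    for entry in entries:
        if entry.get("@type") == "Role":
            yield entry["contributor"]["name"], entry.get("roleName")
        else:
            # Plain contributor with no intermediate Role node.
            yield entry["name"], None


print(json.dumps(work, indent=2))
print(list(contributors_with_roles(work)))
```

A consumer that does not care about roles can still reach the person by following `contributor` twice, which is the main appeal of this pattern for schema.org users.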

mellybelly commented 5 years ago

I do think we need consistency with OBO ontologies; could we have a contribution process in OBI? I also completely agree that we ALSO need consistency with schema.org and their roles. I am not opposed to releasing this as multiple flavors. We could also think about how it relates people to objects in contexts like biolink, which is ontology-neutral with respect to serializations; perhaps something similar would work here?

@cmungall or @balhoff perhaps you have ideas for us here?

cmungall commented 5 years ago

I think it would be convenient to make triples artefact property personURI. E.g. in Noctua we may want a reviewer property between a model and a curator.

But I think it's good to have a more granular representation so things can hang off the role instance

However, it can be confusing if you publish shadow URIs. We have done this in RO but it causes confusion.

diatomsRcool commented 5 years ago

Not sure if this helps.... https://docs.google.com/document/d/11ikZN5hGFqqaCdElaqP-K-_xHaLn6SftBHzaWUTKIH4/edit#

These are recommendations developed via the Research Data Alliance. We are also working with VIVO to add Activity classes.

There's a PROV way to do roles and a VIVO way to do roles.

diatomsRcool commented 5 years ago

FYI - our work with ORCID and RDA is paying off. We'll have specimens on ORCID profiles soon. We have a pilot that is working. ORCID is adding a physical object work type to their schema.

mellybelly commented 5 years ago

@diatomsRcool can you make sure all the roles you need are in CRO? also could use some help thinking about OWL punning so that we can use CRO as object props vs. classes.

nicolevasilevsky commented 5 years ago

@diatomsRcool here is the issue tracker for CRO: https://github.com/data2health/contributor-role-ontology/issues if you need to request new terms.

cmungall commented 5 years ago

Punning the Class and OP is valid: https://www.w3.org/TR/owl2-new-features/#F12:_Punning

You can even have a property chain to infer the simple triple (provided you use quite specific relations otherwise it gets complicated)

Prefix: : <http://x.org/>

Ontology: <http://x.org>

## ====================
## SIMPLE TRIPLE APPROACH 
## ====================

# Requires our test role to be an OP
ObjectProperty: creator_of
  Annotations: rdfs:label "creator of"

# Example Triples
Individual: Person1
  Facts: creator_of Work1
Individual: Work1

## ====================
## ROLE INSTANCE APPROACH
## ====================

## NOTE: this is punning
Class: creator_of

## OPs for connecting role instance to
## person and work
##
## The actual OP names are irrelevant here.
## This could use a BFO role approach
## (inheres-in, and a shortcut for realizes-has-output).
## My favored approach is to treat the role as
## process (i.e the process of creation enabled by
## a single individual).
## here roles are like MFs in a GO-CAM.
## But structurally it's all the same
ObjectProperty: has_person
ObjectProperty: has_work

Individual: RoleInst2
  Types: creator_of
  Facts: has_person Person2, has_work Work2
Individual: Person2
Individual: Work2

## Inferring simple triples from role-instances

ObjectProperty: creator_of
  SubPropertyChain: inverse(has_person) o has_work

Note that punning classes and properties reveals a few bugs in some toolchains; I will report these separately.
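To make the property chain concrete, here is a rough Python sketch of the inference it licenses, using plain tuples rather than an OWL reasoner. The function name and the tuple representation are illustrative assumptions; only the property and individual names come from the snippet above.

```python
# Role-instance data mirroring the RoleInst2 example:
# each tuple is (subject, property, object).
role_triples = {
    ("RoleInst2", "rdf:type", "creator_of"),
    ("RoleInst2", "has_person", "Person2"),
    ("RoleInst2", "has_work", "Work2"),
}


def infer_simple_triples(triples, role_property="creator_of"):
    """Apply the chain inverse(has_person) o has_work -> role_property.

    Like the OWL property chain, this does not look at the class the
    role instance belongs to: any node with both a has_person and a
    has_work link produces a role_property triple.
    """
    persons = {}
    works = {}
    for s, p, o in triples:
        if p == "has_person":
            persons.setdefault(s, set()).add(o)
        elif p == "has_work":
            works.setdefault(s, set()).add(o)

    inferred = set()
    for role_inst, people in persons.items():
        for person in people:
            for work in works.get(role_inst, ()):
                inferred.add((person, role_property, work))
    return inferred


print(infer_simple_triples(role_triples))
```

Because the chain ignores the role instance's type, generic `has_person`/`has_work` links would yield `creator_of` for every kind of role, which is presumably why the relations need to be quite specific.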

LisaOKeefe1 commented 5 years ago

@cmungall thanks so much for your contribution here! We really appreciate it.

cmungall commented 5 years ago

hmm Ignazio says this is in fact not valid, trying to find out more..

mbrush commented 5 years ago

Hi all. Thought I'd share how we modeled Contributions in the SEPIO model, as I dealt with similar questions and considerations as those described above. For SEPIO we needed a way to precisely model contributions an agent made to an artifact such as an assertion or piece of evidence - including the identity of the contributing agent, methods they followed, roles they played, organizations they acted on behalf of, and when and where the contribution occurred.

We settled on an approach that links a generated artifact to a contributing agent via an instance of a Contribution class - a processual entity that represents "the actions taken by a particular agent in the creation, modification, assessment, or deprecation of an artifact." We called this the 'qualified contribution' approach - and it is documented in more detail here.
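As a rough illustration of the 'qualified contribution' shape described above, here is a minimal Python sketch. The field names (other than `realizes`, which is mentioned in this thread) are hypothetical and are not taken from the SEPIO specification. The example values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Contribution:
    """One agent's actions on one artifact (hypothetical field names)."""
    agent: str                                          # the single contributing agent
    artifact: str                                       # the assertion, evidence item, etc.
    realizes: List[str] = field(default_factory=list)   # roles played
    method: Optional[str] = None                        # method followed
    on_behalf_of: Optional[str] = None                  # organization acted for
    occurred_at: Optional[str] = None                   # when
    location: Optional[str] = None                      # where


def roles_by_agent(artifact, contributions):
    """Map each agent to the roles they realized on the given artifact."""
    result = {}
    for c in contributions:
        if c.artifact == artifact:
            result.setdefault(c.agent, []).extend(c.realizes)
    return result


contributions = [
    Contribution("AgentA", "Assertion1", ["creator"], occurred_at="2019-05-30"),
    Contribution("AgentB", "Assertion1", ["reviewer"], on_behalf_of="OrgX"),
    Contribution("AgentA", "Evidence7", ["editor"]),
]

print(roles_by_agent("Assertion1", contributions))
```

The artifact is linked to the agent only through the Contribution object, so details such as time, place, and organization attach to the contribution rather than to the person or the artifact.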

So how does this relate to CRO? . . . As is, CRO classes could provide the roles needed to populate the 'realizes' attribute of Contributions (effectively punning CRO classes in the data). I don’t think we would ever need to instantiate CRO role classes in the SEPIO model, because all we need to describe can be captured using qualified contributions. If CRO moved toward treatment of contributions as processual entities, then the SEPIO 'Contribution' class could end up being equivalent to the root CRO 'contributor role' class.

mellybelly commented 5 years ago

@nicolevasilevsky @kristiholmes @mbrush 's proposal is not dissimilar from the model we discussed at Rocky, where the role is reified and is a class. It would be good to all get on the phone together and discuss. If this can work for GA4GH, GO, CD2H, OBO, then it will be golden.

cmungall commented 5 years ago

Matt, you have a contribution potentially having N agents. I would think of this always being a single agent. Even for a group contribution, the group is the agent. (you could break this down into contribution parts, and the group into its members)

Under this, is there a need for both a role and a process? Isn't a contribution always a realization of a role, and in 1:1?

(yes you are right, I have mentally translated contributions to molecular activities, agents into protein complexes of gene products, and am trying to go-camify your model, sorry...)


diatomsRcool commented 5 years ago

Under this, is there a need for both a role and a process? Isn't a contribution always a realization of a role, and in 1:1?

I take your point, but I suspect this may not always be the case. Could someone engage in a paper correcting process as an author, an editor, or a reviewer? This is a trivial example, but I've had people from the biodiversity collections space be adamant about not including roles and others who have been adamant about including them. It may be an issue when talking about physical specimen management. The more flexible we can be the better.

mbrush commented 5 years ago

Matt, you have a contribution potentially having N agents. I would think of this always being a single agent. Even for a group contribution, the group is the agent. (you could break this down into contribution parts, and the group into its members)

The SEPIO model would naturally support such a representation – where a Contribution is 1:1 with an agent, and this agent could be a Group/Organization comprised of many individual persons. But you could instead or in addition represent more specifically the individual contributions of each member, if desired. See below.

Under this, is there a need for both a role and a process? Isn't a contribution always a realization of a role, and in 1:1?

Typically, the Contribution process realizes a single, specific role played by the contributing agent, in which case Contribution to Role is 1:1. If this were always the case, our model could drop the 'realizes: Role' attribute and type the Contribution to reflect the specific role being played. For example, instead of a single Contribution class, we would have a hierarchy of role-based Contribution classes such as 'Creator Contribution' and 'Editor Contribution'. Or we could treat Contributions as Role instances instead of processes.

We chose not to do this for a couple of reasons. First, we preferred a single Contribution type for simplicity at this level, with custom definitions of Roles allowed in the value set part of the model. This enables maximal flexibility and customization, i.e. a given SEPIO profile/implementation can create a value set containing the contributor roles it needs and bind this to the realizes attribute. This is more a stylistic preference, however. The more substantive reason we went this direction is that we want to allow for Contribution instances that describe more than one role played by a single Agent (where Contribution to Role is NOT 1:1). This, I think, speaks to the flexibility that @diatomsRcool describes above, and it lets us avoid a proliferation of Contribution instances when we want to capture the fact that Agent X played many different roles in the creation of Artifact Y.

So, a Contribution instance is always 1:1 w.r.t. its Agent, and typically but not necessarily 1:1 w.r.t. a Role played/realized. This is reflected in the definition of the Contribution class as "the actions taken by a particular agent in the creation, modification, assessment, or deprecation of an artifact". There may be many actions here that realize different roles - that could be grouped into a single Contribution object if desired. But if it was important to track when/where/for whom each role was realized, you would need to split things into separate Contributions for each role realized.
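The collapse-or-split choice can be sketched in a few lines of Python. This is an illustrative model only: the class and function names are assumptions rather than SEPIO code, and the role labels are placeholders.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Contribution:
    agent: str                   # always exactly one agent
    artifact: str
    realizes: Tuple[str, ...]    # one or more roles


def split_by_role(contribution):
    """One Contribution per role, so each can carry its own when/where."""
    return [replace(contribution, realizes=(role,))
            for role in contribution.realizes]


def collapse(contributions):
    """Group contributions by (agent, artifact), merging their roles."""
    grouped = {}
    for c in contributions:
        roles = grouped.setdefault((c.agent, c.artifact), [])
        for role in c.realizes:
            if role not in roles:
                roles.append(role)
    return [Contribution(agent, artifact, tuple(roles))
            for (agent, artifact), roles in grouped.items()]


combined = Contribution("AgentX", "ArtifactY", ("creator", "editor"))
parts = split_by_role(combined)

print(parts)
print(collapse(parts))
```

Splitting loses nothing and makes room for per-role detail. Collapsing is only safe when no such per-role detail needs to be kept.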

Bottom line, the SEPIO model gives flexibility to collapse roles where desired, but split them out for more granular characterization where needed. I'm sure this approach has potential issues/challenges as well, but it is what we went with initially. There is still time to evolve this a bit if necessary. Finally, no apology necessary for your mental go-camification - I think it is an apt and perhaps informative analogy.