Open marcolarosa opened 4 years ago
Maybe it is a bit overkill using CRO ontology (https://github.com/data2health/contributor-role-ontology), but it should be considered.
Several efforts already exist to create taxonomies of common contributor roles in scholarly work. The one most commonly used by journals is perhaps the CASRAI CRediT roles used in JATS.
There are URIs for these in https://jats4r.org/credit-taxonomy, e.g. https://dictionary.casrai.org/Contributor_Roles/Project_administration so we should be able to add these to the RO-Crate context, however you may notice that unhelpfully these URIs don't resolve to a readable page at all, so following our own advice we would still need to add additional documentation links to the RO-Crate.
@jmfernandez links to the CRO ontology, which formalizes this and extends it with lots of useful research terms, e.g. http://purl.obolibrary.org/obo/CRO_0000068 is "conservator role; A person responsible for the preservation of artistic and cultural artifacts." - I think these could be good for the many uses of RO-Crates.
Unlike casrai.org, CRO have also made their URIs resolve to a textual page, but they have gone for OBO-style names like CRO_0000068 that have almost no meaning without resolving the URL, so within the JSON-LD they would be almost meaningless. For rendering and for humans they would therefore also need some textual declaration within the RO-Crate.
The SKOS method with ad-hoc properties in #71 could help for other ad-hoc roles not in CRO.
That would allow roles to be both defined and referenced within the RO-Crate - potentially later moved out if used many places.
That leaves how to relate the role to the person and the crate (or one of its resources), which I think https://schema.org/Role or https://schema.org/OrganizationRole can be used as an intermediary for all properties - see http://blog.schema.org/2014/06/introducing-role.html which describes this intermediate node concept well. Here https://schema.org/roleName can be both a URL to a concept or free-text - so we don't necessarily need to SKOS anything ad-hoc (unless the roles already come from a 3rd party taxonomy).
We could document this first for https://schema.org/contributor, which is probably where most of the arbitrary roles will work ("illustrator"), and secondly for organizational roles ("project manager"), which have more to do with a person's affiliation.
Multiple contributors can take on multiple roles in forming a creative work.
To specify a role, break up `contributor` with an intermediate Contextual Entity of type `Role` that again links on with `contributor` to the individual `Person` or `Organization`. Note that one individual may take part in multiple roles, but each role goes to just one person.
The role is specified using `roleName`. For academic work, RO-Crate recommends using the CASRAI Contributor Roles Taxonomy (CRediT) and/or the Contributor Role Ontology (CRO). Free-text roles can be used as fall-back when no specific term is available. Multiple `roleName` identifiers can be included for a particular `Role` entity, but should each describe (in a broad sense) the same kind of role.
{
  "@context": ["http://schema.org/",
    {
      "credit": "https://dictionary.casrai.org/Contributor_Roles/",
      "obo": "http://purl.obolibrary.org/obo/"
    }
  ],
  "@graph": [
    {
      "@id": "patients_report.pdf",
      "@type": "CreativeWork",
      "name": "Report and diagrams of patient admissions",
      "author": {"@id": "https://orcid.org/0000-0002-1825-0097"},
      "contributor": [
        {"@id": "#af1bf5db-96f7-4143-b420-41b7ca1a4052"},
        {"@id": "#b3b04f6c-526d-41c3-a9e0-ded8bb1bbfc9"},
        {"@id": "#bf768c8f-acdc-448d-9a17-76eb19bc6caa"}
      ]
    },
    {
      "@id": "https://orcid.org/0000-0002-1825-0097",
      "@type": "Person",
      "name": "Josiah Carberry"
    },
    {
      "@id": "https://orcid.org/0000-0000-1234-5678",
      "@type": "Person",
      "name": "Alice W Land"
    },
    {
      "@id": "#af1bf5db-96f7-4143-b420-41b7ca1a4052",
      "@type": "Role",
      "contributor": {"@id": "https://orcid.org/0000-0002-1825-0097"},
      "roleName": [
        "original draft preparation",
        {"@id": "credit:Writing_original_draft"},
        {"@id": "obo:CRO_0000088"}
      ]
    },
    {
      "@id": "#b3b04f6c-526d-41c3-a9e0-ded8bb1bbfc9",
      "@type": "Role",
      "contributor": {"@id": "https://orcid.org/0000-0000-1234-5678"},
      "roleName": [
        "making figures",
        {"@id": "obo:CRO_0000003"},
        {"@id": "credit:Visualization"}
      ]
    },
    {
      "@id": "#bf768c8f-acdc-448d-9a17-76eb19bc6caa",
      "@type": "Role",
      "contributor": {"@id": "https://orcid.org/0000-0000-1234-5678"},
      "roleName": [
        "data collection",
        {"@id": "obo:CRO_0000036"},
        {"@id": "credit:Investigation"}
      ]
    }
  ]
}
In the example above, we see that Josiah (ORCID 0000-0002-1825-0097) has a role, writing the original draft (he is also listed directly as an `author`).
There are two more contributor roles, both held by Alice (ORCID 0000-0000-1234-5678):
- making figures, identified as CRediT `Visualization`.
- data collection, identified as CRediT `Investigation`.
TODO: Do we really want to support both? CRO might be better, as it is still mappable to the shorter, more readable CRediT terms.
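The `credit:` and `obo:` identifiers in the example only mean something if the `@context` declares a matching prefix. A minimal Python sketch of that expansion follows. It is not a full JSON-LD processor: the helper names `prefix_map` and `expand` are hypothetical, and it assumes the context maps an `obo` prefix to `http://purl.obolibrary.org/obo/`.

```python
# Hypothetical helpers: expand compact IRIs such as "credit:Visualization"
# against simple "prefix": "base IRI" entries of a JSON-LD @context.

def prefix_map(context):
    """Collect string-valued prefix definitions from an @context value."""
    entries = context if isinstance(context, list) else [context]
    prefixes = {}
    for entry in entries:
        if isinstance(entry, dict):
            for key, value in entry.items():
                if isinstance(value, str):
                    prefixes[key] = value
    return prefixes


def expand(compact_iri, prefixes):
    """Return the full IRI, or the input unchanged if the prefix is unknown."""
    prefix, sep, suffix = compact_iri.partition(":")
    if sep and prefix in prefixes and not suffix.startswith("//"):
        return prefixes[prefix] + suffix
    return compact_iri


# Assumed context: the "obo" prefix here is an assumption of this sketch.
context = [
    "http://schema.org/",
    {
        "credit": "https://dictionary.casrai.org/Contributor_Roles/",
        "obo": "http://purl.obolibrary.org/obo/",
    },
]
prefixes = prefix_map(context)
expanded_credit = expand("credit:Visualization", prefixes)
expanded_obo = expand("obo:CRO_0000003", prefixes)
```

A prefix that is used in the graph but not declared in the context is returned unchanged by this sketch, which is a quick way to spot a mismatch between the two.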
Ad-hoc roles can be provided textually for more specific roles, which may not be considered academic but have nevertheless contributed:
{
  "@id": "#f1c16a15-4d9c-4546-b1f1-483e4f899bfc",
  "@type": "Role",
  "contributor": {"@id": "https://orcid.org/0000-0000-1234-5678"},
  "roleName": "quadcopter drone pilot"
},
If the contributor role of a person is unknown, then the `contributor` property from a `CreativeWork` should link directly to the `Person` instead of an intermediary `Role`.
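As a rough illustration of how a consumer might read this structure, the Python sketch below walks `contributor` from a work, follows any intermediate `Role` to the person, and falls back to a direct link when no role is given. The function names and the small sample graph are hypothetical; only the property names come from the proposal above.

```python
def as_list(value):
    """Normalize a JSON-LD value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def ref_id(value):
    """Accept either {"@id": "..."} or a bare string identifier."""
    return value["@id"] if isinstance(value, dict) else value


def contributions(graph, work_id):
    """Return (person id, free-text role names, role term ids) per contribution."""
    by_id = {entity["@id"]: entity for entity in graph}
    result = []
    for ref in as_list(by_id[work_id].get("contributor")):
        entity = by_id.get(ref_id(ref), {})
        if "Role" in as_list(entity.get("@type")):
            role_names = as_list(entity.get("roleName"))
            labels = [n for n in role_names if isinstance(n, str)]
            terms = [ref_id(n) for n in role_names if isinstance(n, dict)]
            for person in as_list(entity.get("contributor")):
                result.append((ref_id(person), labels, terms))
        else:
            # Role unknown: the work links directly to the Person.
            result.append((ref_id(ref), [], []))
    return result


# Hypothetical sample graph, abridged from the example above.
graph = [
    {"@id": "patients_report.pdf", "@type": "CreativeWork",
     "contributor": [{"@id": "#role1"},
                     {"@id": "https://orcid.org/0000-0000-1234-5678"}]},
    {"@id": "#role1", "@type": "Role",
     "contributor": {"@id": "https://orcid.org/0000-0002-1825-0097"},
     "roleName": ["original draft preparation",
                  {"@id": "credit:Writing_original_draft"}]},
    {"@id": "https://orcid.org/0000-0002-1825-0097", "@type": "Person",
     "name": "Josiah Carberry"},
    {"@id": "https://orcid.org/0000-0000-1234-5678", "@type": "Person",
     "name": "Alice W Land"},
]
found = contributions(graph, "patients_report.pdf")
```

Here `found` holds one entry for Josiah with a label and a CRediT term, and one entry for Alice with empty role lists, because she is linked without a `Role`.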
I've raised https://gitlab.com/JATS4R/credit-taxonomy/-/issues/8 with the CRediT people; I think their URLs used to work two years ago.
It may be important to highlight the roles of individuals within organizations they are affiliated with. Consider for instance a report published by a Director compared to another from a summer Intern. Declaring membership in other organizations can also be important for being open about potential Conflict of Interest situations.
For this, RO-Crate recommends using an intermediary `OrganizationRole` contextual entity at the `memberOf` from a person. It is RECOMMENDED to represent the direct `affiliation` to the main organization/employer in parallel:
{
  "@context": "http://schema.org/",
  "@graph": [
    {
      "@id": "https://orcid.org/0000-0002-1825-0097",
      "@type": "Person",
      "name": "Josiah Carberry",
      "affiliation": {"@id": "#brownUniversity"},
      "memberOf": [
        {"@id": "#6adc2ffa-3260-4642-9408-609100a1b7c6"},
        {"@id": "#c4676ff7-dd65-41c4-a4f9-43784e69933c"},
        {"@id": "#0c14fb64-197b-4c46-ab6e-86cd3d86f01e"}
      ]
    },
    {
      "@id": "#brownUniversity",
      "@type": "Organization",
      "name": "Brown University"
    },
    {
      "@id": "#bigPharma",
      "@type": "Organization",
      "name": "Big Pharma Ltd."
    },
    {
      "@id": "#6adc2ffa-3260-4642-9408-609100a1b7c6",
      "@type": "OrganizationRole",
      "memberOf": {"@id": "#brownUniversity"},
      "roleName": "Professor",
      "startDate": "1929",
      "url": "https://library.brown.edu/info/hay/carberry/"
    },
    {
      "@id": "#c4676ff7-dd65-41c4-a4f9-43784e69933c",
      "@type": "OrganizationRole",
      "memberOf": {"@id": "#brownUniversity"},
      "roleName": "President of Josiah S Carberry Fund",
      "startDate": "1955-05-13"
    },
    {
      "@id": "#0c14fb64-197b-4c46-ab6e-86cd3d86f01e",
      "@type": "OrganizationRole",
      "memberOf": {"@id": "#bigPharma"},
      "roleName": "Board Member"
    }
  ]
}
In this example we see that Josiah has two roles for his main affiliation at Brown University. In addition, Josiah has declared being a Board Member of a commercial organization.
On the Role contextual entity the `memberOf` link goes on to the actual organization the person is a member of. startDate and endDate may be added to specify historical roles and positions that are relevant to declare, and url can provide a link documenting that particular engagement.
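To make the two-step `memberOf` chain concrete, here is a small Python sketch that lists a person's memberships together with role name and dates. It is only an illustration under assumptions: `memberships` and the `#role-professor` identifier are hypothetical, and the sketch also tolerates a `memberOf` that points straight at an organization.

```python
def as_list(value):
    """Normalize a JSON-LD value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def ref_id(value):
    """Accept either {"@id": "..."} or a bare string identifier."""
    return value["@id"] if isinstance(value, dict) else value


def memberships(graph, person_id):
    """List organization, role and dates for each memberOf link of a person."""
    by_id = {entity["@id"]: entity for entity in graph}
    rows = []
    for ref in as_list(by_id[person_id].get("memberOf")):
        entity = by_id.get(ref_id(ref), {})
        if "OrganizationRole" in as_list(entity.get("@type")):
            # Intermediate role: follow its own memberOf to the organization.
            for org_ref in as_list(entity.get("memberOf")):
                org = by_id.get(ref_id(org_ref), {})
                rows.append({
                    "organization": org.get("name", ref_id(org_ref)),
                    "role": entity.get("roleName"),
                    "start": entity.get("startDate"),
                    "end": entity.get("endDate"),
                })
        else:
            # Direct membership without a declared role.
            rows.append({
                "organization": entity.get("name", ref_id(ref)),
                "role": None, "start": None, "end": None,
            })
    return rows


# Hypothetical sample graph, abridged from the example above.
graph = [
    {"@id": "https://orcid.org/0000-0002-1825-0097", "@type": "Person",
     "name": "Josiah Carberry",
     "memberOf": [{"@id": "#role-professor"}, {"@id": "#bigPharma"}]},
    {"@id": "#brownUniversity", "@type": "Organization",
     "name": "Brown University"},
    {"@id": "#bigPharma", "@type": "Organization", "name": "Big Pharma Ltd."},
    {"@id": "#role-professor", "@type": "OrganizationRole",
     "memberOf": {"@id": "#brownUniversity"},
     "roleName": "Professor", "startDate": "1929"},
]
rows = memberships(graph, "https://orcid.org/0000-0002-1825-0097")
```

A role without an `endDate` comes back with `end` set to `None`, which a renderer could treat as a current position.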
I think we should restrict where you may expect a `Role` - by Schema.org they can appear almost anywhere, which is not so helpful for developers.
Be aware that schema.org also allows using the Role type as a statement on a property, see http://blog.schema.org/2014/06/introducing-role.html. I would avoid that usage (see Stian's comment about helpfulness for developers) but thought it was better to be aware of it.
On Thu, Jun 4, 2020 at 3:39 PM Stian Soiland-Reyes notifications@github.com wrote:
I think we should restrict where you may expect a Role - by Schema.org they can appear almost anywhere, which is not so helpful for developers.
A few comments as the developer of Describo (these comments are biased from working to implement this linked data structure into an easy to use GUI tool):
I also agree with Stian's comment about limiting where a Role should be expected.
In our OCFL / RO-Crate POC for PARADISEC we implemented the role as follows:
"contributor": [
  {
    "@id": "#1",
    "@type": "Role",
    "name": "collector",
    "contributor": { "@id": "#2" }
  },
  {
    "@id": "#2",
    "@type": "Person",
    "name": "..."
  },
We followed the blog post Stian referenced but we used `name` instead of `roleName` in the POC, though this is a trivial update on our part and I think roleName makes the value of the property clearer in that context.
Note also that in our case we use a text value without references to external ontologies. Not only is this easy to implement in code but it also makes sense from the perspective of the users for whom that role means something. Mapping our roles to an external ontology would make the data more academically rigorous at the expense of added complexity for my users.
I wanted to point that out as our initial testers of Describo are already indicating that the tool is easy to use but they're misusing it in places because they don't know about the underlying spec or graph. It's almost a double edged sword - the tool is useable by novices but it needs much more code to ensure they ultimately create a sensible crate without needing to know about the spec; which they probably won't read. I hope that makes sense.
So, the more complex this spec becomes the harder it will be to keep the tool easy to use unless the implementation is as Stian notes: A role could be a reference to something external or it could be a simple text value. Both are ok.
Thanks, @marcolarosa - I agree and I think a tool like Describo probably has to be even more prescriptive than the specifications to lead people on the right path - the multi-layer profiles can help with that.
It would have been good to have a different property for the controlled vocabulary instead of overloading `roleName` with a mix of strings and URIs, which would also become difficult to render and order in UIs.
I think the loose `additionalType` is well suited for that - not `identifier` and so on, because a `Role` object represents that Something took up SomeRole (at SomeTime) - and many can take up the same type of role multiple times, each of which would be a new `Role` instance.
Schema.org has no vocabulary for organizing hierarchies of Role, so it would be wrong to have, say, a generic `Creator` instance of `Role` - rather that is a particular subtype of Role.
To allow linking to various controlled vocabularies that know nothing of schema.org/Role, and indeed where those identifiers might be described as properties rather than classes, using the loose `additionalType` makes (to me) more sense, although http://schema.org/roleName does formally permit `URL`. We can then link this to #71, although because `roleName` has text, that would only be needed if the hierarchy of roles was important or pre-existing.
additionalType seems a good option. It would be nice if it accepted DefinedTerm as range rather than only URL.
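One possible reading of this suggestion, sketched in Python: keep free-text labels in `roleName` and move vocabulary references to `additionalType`. This is only an illustration of the idea under discussion, not an agreed encoding, and `split_role_names` is a hypothetical helper.

```python
def as_list(value):
    """Normalize a JSON-LD value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def split_role_names(role):
    """Return a copy of a Role with {"@id": ...} values moved from
    roleName to additionalType, leaving only text labels in roleName."""
    labels, terms = [], []
    for value in as_list(role.get("roleName")):
        if isinstance(value, dict) and "@id" in value:
            terms.append(value)
        else:
            labels.append(value)
    updated = {key: val for key, val in role.items() if key != "roleName"}
    if labels:
        updated["roleName"] = labels[0] if len(labels) == 1 else labels
    if terms:
        updated["additionalType"] = as_list(role.get("additionalType")) + terms
    return updated


# Hypothetical Role entity mixing a label with a vocabulary reference.
role = {
    "@id": "#b3b04f6c-526d-41c3-a9e0-ded8bb1bbfc9",
    "@type": "Role",
    "roleName": ["making figures", {"@id": "credit:Visualization"}],
}
separated = split_role_names(role)
```

With the labels and the term references in separate properties, a UI can render `roleName` as plain text without inspecting each value's type.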
Community meeting notes show that this is delayed until after the 1.1 release. @ptsefton noted on the July 2020 call that person or organization role profiles will need concrete use cases and could be widely interpreted across domains. In the meantime, work here could include listing use cases or examples of person roles or organization roles. @stain has a sample encoding above using CRediT and Schema.org, but we could continue to formalize as the next step.
Based on discussion at today's meeting.
I reported that I've been reworking the PARADISEC to RO-Crate export and as part of that I've modeled a person's role as a link from a `role` property on the person:
{
  "@id": "...",
  "@type": "Person",
  "name": "...",
  "role": {
    "@id": "....",
    "@type": "Role",
    "name": "performer"
  }
}
And each person is listed as a `contributor`, as people in the PARADISEC case are contributors to the data who have specific roles.
@marco - how does that role link to the Dataset or File in question? Can you show a complete example with the contributor etc?
Here's an example for a PARADISEC item - it's abridged and shown before flattening but you'll get the gist:
{
  "@id": "./",
  "@type": ["Dataset", "RepositoryItem"],
  "name": "....",
  "contributor": [
    { "@id": "...", "@type": "Person", "name": "Marco", "role": [
      { "@id": "#collector", "@type": "Role", "name": "collector" },
      { "@id": "#operator", "@type": "Role", "name": "operator" }
    ]},
    { "@id": "...", "@type": "Person", "name": "Peter", "role": [
      { "@id": "#operator", "@type": "Role", "name": "operator" }
    ]},
    { "@id": "...", "@type": "Person", "name": "Nick", "role": [
      { "@id": "#collector", "@type": "Role", "name": "collector" },
      { "@id": "#performer", "@type": "Role", "name": "performer" }
    ]}
  ]
}
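Since the example is shown before flattening, a rough Python sketch of that step may help: nested entities are moved into a flat map keyed by `@id` and replaced by `{"@id": ...}` references, so a shared role such as `#operator` ends up as a single entity. The `flatten` helper and the `#marco` / `#peter` identifiers are hypothetical stand-ins (the person identifiers are elided in the example), and this is not a full JSON-LD flattening algorithm.

```python
def flatten(entity, graph=None):
    """Move nested entities into a dict keyed by @id, leaving references."""
    graph = {} if graph is None else graph
    flat = {}
    for key, value in entity.items():
        if isinstance(value, list):
            flat[key] = [_reference(item, graph) for item in value]
        else:
            flat[key] = _reference(value, graph)
    # Entities that share an @id are merged into one record.
    graph.setdefault(entity["@id"], {}).update(flat)
    return graph


def _reference(value, graph):
    """Replace a nested entity with {"@id": ...}; pass other values through."""
    if isinstance(value, dict) and "@id" in value:
        if len(value) > 1:  # carries properties, so it is a nested entity
            flatten(value, graph)
        return {"@id": value["@id"]}
    return value


# Hypothetical identifiers "#marco" and "#peter" stand in for elided ones.
nested = {
    "@id": "./",
    "@type": ["Dataset", "RepositoryItem"],
    "name": "Example item",
    "contributor": [
        {"@id": "#marco", "@type": "Person", "name": "Marco", "role": [
            {"@id": "#collector", "@type": "Role", "name": "collector"},
            {"@id": "#operator", "@type": "Role", "name": "operator"},
        ]},
        {"@id": "#peter", "@type": "Person", "name": "Peter", "role": [
            {"@id": "#operator", "@type": "Role", "name": "operator"},
        ]},
    ],
}
flat_graph = flatten(nested)
```

In this style a role entity is shared by every person who holds it, which is why it describes a kind of role for the whole crate rather than one person's contribution to one file.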
This is my last contribution to this thread as I think enough words have been spilled on the matter... :-)
@ptsefton and I have had numerous conversations on this matter and it seems to come down to an ability to model roles in simple crates like PARADISEC vs more detailed ones.
By simple I mean that a person is encoded at the crate level where they have multiple roles in relation to the whole crate (as per the structure in the comment https://github.com/ResearchObject/ro-crate/issues/79#issuecomment-808983453).
We are not encoding person A with role B on file C vs person A with role X on file Y. In this case I appreciate that this way of modelling will result in multiple instances of that person within the crate. And I appreciate that ro-crate should recommend a different way of modelling for that use case.
My request to the community is to support both styles. For simple cases like mine allow an implementer to add a role property to the person which encompasses all of their roles in that crate as a whole. For the more complex crates then an implementer should do it {the agreed upon way} so that the crate does not end up with multiple copies of the same person.
As a {type of user}, I want {some goal} so that {some reason}: As the developer for the PARADISEC project (languages) I want to be able to encode the role a person had in relation to the data described in the crate. Specifically, I want to use my own controlled vocabulary of roles and my roles apply to the crate as a whole, not to pieces of it.