geneontology / go-site

A collection of metadata, tools, and files associated with the Gene Ontology public web presence.
http://geneontology.org
BSD 3-Clause "New" or "Revised" License

Use a URL in the Noctua contributor field #1615

Closed. suzialeksander closed 1 year ago

suzialeksander commented 3 years ago

Currently, every pathway imported from Yeast Pathways has its own unique contributor, e.g.

http://noctua-dev.berkeleybop.org/editor/graph/gomodel:YEAST-DE-NOVO-PYRMID-DNT?

lists contributor https://pathway.yeastgenome.org/YEAST/NEW-IMAGE?object=YEAST-DE-NOVO-PYRMID-DNT

We'd like to simplify this and truncate the URL, so every model would have contributor https://pathway.yeastgenome.org
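As a minimal sketch of the truncation being proposed, the following drops the path and query from a contributor URL and keeps only the scheme and host. The function name `truncate_to_origin` is hypothetical and is not part of any existing conversion code.

```python
from urllib.parse import urlsplit, urlunsplit


def truncate_to_origin(url: str) -> str:
    """Keep only the scheme and host of a URL, dropping path, query, and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, "", "", ""))


full = "https://pathway.yeastgenome.org/YEAST/NEW-IMAGE?object=YEAST-DE-NOVO-PYRMID-DNT"
print(truncate_to_origin(full))  # https://pathway.yeastgenome.org
```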

This would need to be added to users.yaml. Some considerations from @dustine32:

We should also talk about this on either the next Alliance Pathways or pathways2go call. Whichever way we decide to handle it could result in changes to the model conversion code, the Noctua landing page/form/editor, Minerva, or some other software that consumes or produces these models.

cmungall commented 3 years ago

How about just giving us the orcids of all pathway curators at SGD and we can attribute all to all?

Ideally the ShEx specification would be able to answer these kinds of questions authoritatively. Unfortunately, it is underspecified right now:

<GoCamModel> {
  a [owl:Ontology] + ;
  contributor: xsd:string +; #TODO would be better as an IRI
  date: xsd:string {1}; #TODO can we make this an xsd:date?
  provided_by: xsd:string +; #TODO would be better as an IRI
  rdfs:comment xsd:string *;
  modelstate: xsd:string {1}; #TODO would be better as an IRI
  in_taxon: . *;
  title: xsd:string {1};
  imports: . *;
  oboinowlid: . *; #TODO not sure if we really want this?
  owl:versionIRI . *;
}

Regardless of whether contributor is modeled as a string or a URI (I agree it should be a URI, but we should have a ticket to change this if we need to), this is our central specification and it should give more guidance on how these fields are to be filled in.

I believe ShEx has a pattern construct, so we can place a regex here to constrain this, e.g. to http://orcid.org/... if that is our intent.
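To illustrate the kind of regex that could be used, here is a rough Python sketch that checks whether a contributor value looks like an ORCID URL. The exact pattern is an assumption (four hyphen-separated groups of four digits, with the final character allowed to be X), it does not verify the ORCID checksum, and the ID in the usage lines is only a sample of the right shape.

```python
import re

# Assumed pattern: http(s)://orcid.org/ followed by four groups of four
# characters, where the very last character may be the check character "X".
ORCID_URL = re.compile(r"https?://orcid\.org/\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def is_orcid_url(value: str) -> bool:
    """Return True if the whole value matches the assumed ORCID URL pattern."""
    return ORCID_URL.fullmatch(value) is not None


print(is_orcid_url("http://orcid.org/0000-0002-1825-0097"))
print(is_orcid_url("https://pathway.yeastgenome.org"))
```

The same expression could in principle be carried over into the shape definition once the intended constraint is agreed on.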

I would write the documentation for contributor something like this:

The value of the contributor field MUST be a string literal. The value MUST be a URL that indicates the specific individual that provided some contribution to the model. Refer to Dublin Core for details. The value SHOULD be an ORCID, but in cases where an ORCID is unavailable (e.g. the curation was done in a system outside Noctua) other URLs indicating the individual MAY be used. The contributor SHOULD always denote an individual curator, but in cases where we cannot retrieve a single individual (e.g. the work was performed outside Noctua and done by a group) another URL, e.g. https://pathway.yeastgenome.org, MAY be used to denote a collection of individuals. Note that the provided_by field should be used to indicate groups of people.
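A rough sketch of how the draft wording above might translate into a check, assuming MUST violations are reported as errors and SHOULD violations as warnings. The function name `check_contributor`, the message strings, and the ORCID pattern are all hypothetical, and the ID in the usage lines is only a sample of the right shape.

```python
import re
from urllib.parse import urlsplit

# Assumed ORCID URL shape; the checksum is not verified.
ORCID_URL = re.compile(r"https?://orcid\.org/\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def check_contributor(value) -> str:
    """Classify a contributor value against the draft MUST/SHOULD rules."""
    if not isinstance(value, str):
        return "error: MUST be a string literal"
    try:
        parts = urlsplit(value)
    except ValueError:
        return "error: MUST be a URL"
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "error: MUST be a URL"
    if ORCID_URL.fullmatch(value):
        return "ok: ORCID"
    # Any other URL is allowed by the MAY clauses, but flagged for review.
    return "warning: SHOULD be an ORCID"


for candidate in ("http://orcid.org/0000-0002-1825-0097",
                  "https://pathway.yeastgenome.org",
                  "GOC:cab1"):
    print(candidate, "->", check_contributor(candidate))
```

Under this reading, a CURIE-style value such as GOC:cab1 would be rejected because it is not an http(s) URL, so the wording would need to be relaxed if such identifiers are meant to stay valid.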

Note the use of ISO specification language.

We need similar docs for provided_by

We do need a way to represent the full provenance, that this came from https://pathway.yeastgenome.org/YEAST/NEW-IMAGE?object=YEAST-DE-NOVO-PYRMID-DNT - how are we doing this for Reactome @dustine32 ?

dustine32 commented 3 years ago

@cmungall Reactome is doing the same as YeastCyc:

https://reactome.org/content/detail/R-HSA-983695

YeastCyc contributor is actually based on Reactome already using this pattern.

vanaukenk commented 3 years ago

@ukemi correct me if I'm wrong, but my understanding of why the Reactome URL was used in the contributor field is that the link would actually take people to a Reactome page that lists the people (curators and researchers) who authored and reviewed the model.

For the MOD imports, we have run into cases where former curators don't have an ORCID and we've instead used the GOC uri in the users.yaml file, for example GOC:cab1.

nickname: 'Carol Bastiani'
organization: WB
uri: 'GOC:cab1'
xref: 'GOC:cab1'

Note that we also have entries currently in the users.yaml to denote groups of curators:

nickname: 'Curators at SGD'
organization: SGD
uri: 'GOC:sgd_curators'
xref: 'GOC:sgd_curators'
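For illustration, here is a small sketch of how a consumer might resolve a model's contributor value against entries like the two above. The entries are copied into a Python list rather than parsed from the real file, and the `resolve_contributor` helper is hypothetical. Real code would load users.yaml with a YAML parser.

```python
# The two users.yaml entries quoted above, mirrored as plain dictionaries.
USERS = [
    {"nickname": "Carol Bastiani", "organization": "WB",
     "uri": "GOC:cab1", "xref": "GOC:cab1"},
    {"nickname": "Curators at SGD", "organization": "SGD",
     "uri": "GOC:sgd_curators", "xref": "GOC:sgd_curators"},
]


def resolve_contributor(uri, users=USERS):
    """Return the entry whose uri matches the contributor value, or None."""
    for entry in users:
        if entry["uri"] == uri:
            return entry
    return None


print(resolve_contributor("GOC:sgd_curators"))
print(resolve_contributor("https://pathway.yeastgenome.org"))
```

A value with no matching entry resolves to None, which is the case a source URL in the contributor field would fall into unless it is added to the file.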

We also discussed on today's Alliance call using a new metadata tag for GO-CAMs from other pathway resources to indicate the source model so that the contributor field would always correspond to an individual or group of individuals from the users.yaml file.

@lpalbou was this proposed tag, in fact, 'source' or something else? https://www.dublincore.org/specifications/dublin-core/dcmi-terms/elements11/source/

It'd be good to get all this nailed down as there are implications for Noctua as well, e.g. the Contributor field on the landing page currently lists users with ORCIDs but that will exclude some contributors included in the soon-to-be imported MOD data.

suzialeksander commented 1 year ago

The contributor has been set to GOC:sgd_curators as part of other tickets.