biolink / biolink-model

Schema and generated objects for biolink data model and upper ontology
https://biolink.github.io/biolink-model/

Align and connect 'supporting document' and 'publications' edge properties #1107

Closed mbrush closed 1 year ago

mbrush commented 2 years ago

At present, the most common Biolink edge property used in Attributes to hold supporting publications is publications.

But Biolink also provides a supporting document edge property that would cover non-publication source documents (a request from TMKP where not all mined text is from publications).

At present these edge properties are not hierarchically related in the Biolink Model.

Questions:

  1. Do we want to keep both?
  2. If so, we should make publications a child of supporting document, and name these properties consistently (e.g. change 'publications' -> 'supporting publications').
  3. If not, which one to keep?
  4. If we decide to only keep 'supporting document' do we need a way to indicate that the document is a publication?
     a. e.g. Attribute.value_type field = biolink:Publication
     b. e.g. a nested Attribute object keyed on the biolink:type property, with value = biolink:Publication
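
To make options (a) and (b) in question 4 concrete, here is a rough sketch of what each could look like as an Attribute-like object. The field names (`attribute_type_id`, `value_type_id`, `attributes`) are assumptions modeled loosely on TRAPI-style Attributes, not a confirmed schema, and `is_publication` is a hypothetical helper added only for illustration.

```python
# Hypothetical Attribute-like dicts; all field names here are assumptions.

# Option (a): flag the document type directly on the Attribute.
option_a = {
    "attribute_type_id": "biolink:supporting_document",
    "value": "PMID:12345678",
    "value_type_id": "biolink:Publication",
}

# Option (b): nest a second Attribute keyed on biolink:type.
option_b = {
    "attribute_type_id": "biolink:supporting_document",
    "value": "PMID:12345678",
    "attributes": [
        {"attribute_type_id": "biolink:type", "value": "biolink:Publication"},
    ],
}


def is_publication(attribute: dict) -> bool:
    """Return True if either convention marks the document as a publication."""
    if attribute.get("value_type_id") == "biolink:Publication":
        return True
    return any(
        nested.get("attribute_type_id") == "biolink:type"
        and nested.get("value") == "biolink:Publication"
        for nested in attribute.get("attributes", [])
    )
```

Option (a) is cheaper to check, while option (b) leaves room for other typing information on the nested Attribute; a consumer that has to accept both would need a check along the lines of `is_publication`.
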
mbrush commented 2 years ago

Proposal:

Another issue here is to decide if the names of these properties should be plural or singular . . .

sierra-moxon commented 2 years ago

@mbrush - is it necessary to change 'publications' to 'supporting publications'? It's totally doable, I just know this is a very widely used parameter and renaming it would require a lot of downstream changes. Alternatively, I could add 'supporting publications' as an alias of 'publications'.

sierra-moxon commented 2 years ago

also should (somewhat pedantically) note that supporting document is currently in the model, and is not multivalued. Would the second part of this proposal make supporting documents multivalued?

sierra-moxon commented 2 years ago

should 'supporting text' also become 'supporting texts' and be a child of 'supporting documents'?

sierra-moxon commented 2 years ago

should 'supporting data set' also be a child of 'supporting documents'?

mbrush commented 2 years ago

My thoughts on your questions @sierra-moxon:

mbrush commented 1 year ago

Re-opening this issue as additional questions regarding the representation of supporting documents have surfaced, and may warrant reconsideration of decisions made above.

Current Biolink representation:


### CLASSES ###

  publication:
    is_a: information content entity
    description: >-
      Any published piece of information. Can refer to a whole publication,
      its encompassing publication (i.e. journal or book) or to a part of a
      publication, if of significant knowledge scope (e.g. a figure, figure
      legend, or section highlighted by NLP). The scope is intended to be
      general and include information published on the web, as well as printed
      materials, either directly or in one of the Publication Biolink
      category subclasses.
    slots:
      - authors
      - pages
      - summary
      - keywords
      - mesh terms
      - xref
      . . . 

### ASSOCIATION SLOTS  ###

  supporting documents:
    is_a: association slot
    description: >-
      One or more referencable documents that report the statement expressed in an Association, or provide 
      information used as evidence supporting this statement.
    range: uriorcurie
    multivalued: true
    examples:
      - value: PMID:12345678

  publications:
    aliases: ["supporting publications"]
    singular_name: publication
    description: >-
      One or more publications that report the statement expressed in an Association, or provide information used as 
      evidence supporting this statement.
    is_a: supporting documents
    multivalued: true
    range: publication
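
One practical consequence of the `is_a: supporting documents` link above is that a consumer asking for supporting documents can also pick up anything stored under publications. A minimal sketch of that lookup follows, assuming edges are plain dicts keyed by slot name; the `SLOT_PARENTS` table, both helper functions, and the example.org URL are illustrative and not part of Biolink or any toolkit.

```python
# Slot hierarchy taken from the YAML excerpt above; only is_a links are kept.
SLOT_PARENTS = {
    "association slot": None,
    "supporting documents": "association slot",
    "publications": "supporting documents",
}


def ancestors(slot: str) -> list[str]:
    """Walk is_a links upward, returning parents from nearest to farthest."""
    chain = []
    parent = SLOT_PARENTS.get(slot)
    while parent is not None:
        chain.append(parent)
        parent = SLOT_PARENTS.get(parent)
    return chain


def collect_supporting_documents(edge: dict) -> list[str]:
    """Gather values of 'supporting documents' and of every slot descending from it."""
    found = []
    for slot, values in edge.items():
        if slot == "supporting documents" or "supporting documents" in ancestors(slot):
            found.extend(values)
    return found


edge = {
    "publications": ["PMID:12345678"],
    "supporting documents": ["https://example.org/report"],  # made-up URL
}
print(collect_supporting_documents(edge))
# prints ['PMID:12345678', 'https://example.org/report']
```
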

Questions:

  1. Confirm we want two slots to link to supporting docs/pubs.

  2. Revisit the idea of renaming these to be more consistent.

  3. Broaden supporting documents and/or publications definition to include the idea of documents supporting things besides Statements (e.g. StudyResults), and indicating that provenance information qualifies as supporting info:

Documents that provide information supporting the creation of a Statement or other Information Entity. Most often this slot is used to reference documents that report the Statement expressed in an Association, or provide evidence or provenance information supporting this Statement. Another common use is to reference documents that supported the creation of a Study Result object.

An issue here is that the publications property is currently an association slot, as its initial scope was to annotate Edges/Statements. But for linking other types of informational entities to the pubs that support them (e.g. Study Results, which are named things, not associations), an association slot is not appropriate (we would need a node property or predicate).

  4. Establish conventions of use for these properties in TRAPI messages (see tickets here and here)

mbrush commented 1 year ago

Voting on decisions that address 1-4 above is happening in Discussion #1234.

mbrush commented 1 year ago

Closing, based on outcomes of voting in #1234 and final decisions documented in the supporting publications specification here.