FAIR-MI / miiid-schema

A metadata schema for the Minimum Information about Intermicrobial Interaction Data (MIIID) using LinkML
https://fair-mi.github.io/miiid-schema/
GNU General Public License v3.0

Intermicrobial interaction modelling decisions #1

Open cpauvert opened 1 year ago

cpauvert commented 1 year ago

This is a rolling issue tracking the pros and cons of modelling decisions, especially how to model relationships (see docs), as well as the assumptions made during the development of the MIIID metadata schema.

0. Model all properties from the Perspective paper as strings

Pros:

Cons:

Status: not considered

1. Microbial Participant as a separate class, with participants as a slot accepting multiple Participant objects

Pros:

Cons:

Status: tried as a first approach; superseded by (3). Commit: 9a2016ae8c62734dd91dfc66889164812b8a6617

2. Model interactions using the Biolink Model

Pros:

Cons:

Status: not considered yet because of complexity

3. participants as a slot accepting multivalued names and tax_id

Pros:

Cons:

Status: considered and implemented. Commit: 10fadb67642a8d4a13ed9619fc2b42945ce9d98c
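To make the difference between (1) and (3) concrete, here is a minimal Python sketch of the two data shapes. It is not generated from the schema: the class names `InteractionNested` and `InteractionFlat` and the field names `participants_name` and `participants_tax_id` are hypothetical and only illustrate the trade-off.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


# Option (1), hypothetical shape: each participant is its own object.
@dataclass
class Participant:
    name: str
    tax_id: int


@dataclass
class InteractionNested:
    participants: List[Participant] = field(default_factory=list)


# Option (3), hypothetical shape: flat, multivalued slots on the interaction.
@dataclass
class InteractionFlat:
    participants_name: List[str] = field(default_factory=list)
    participants_tax_id: List[int] = field(default_factory=list)

    def pairs(self) -> List[Tuple[str, int]]:
        """Pair names with tax IDs by position; reject lists of unequal length."""
        if len(self.participants_name) != len(self.participants_tax_id):
            raise ValueError("names and tax IDs must have the same length")
        return list(zip(self.participants_name, self.participants_tax_id))


nested = InteractionNested([
    Participant("Escherichia coli", 562),
    Participant("Bacillus subtilis", 1423),
])
flat = InteractionFlat(
    participants_name=["Escherichia coli", "Bacillus subtilis"],
    participants_tax_id=[562, 1423],
)
```

In the nested shape the name and the tax ID of a participant cannot drift apart, whereas the flat shape is simpler to fill in (for example from a spreadsheet) but relies on the positions of the two lists staying in sync.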

cpauvert commented 1 year ago

Regarding tax_id: it is currently defined as an integer but corresponds to an NCBI tax_id. Should I use https://biolink.github.io/biolink-model/docs/OrganismTaxon.html or the NCBITaxon ontology? This could be related to https://github.com/linkml/linkml/issues/1112

I first tried modelling it as an ontology, following https://linkml.io/linkml/schemas/enums.html#dynamic-enums. See e914dd47941e4f2d62f764141fab7fc76f181a8f in branch taxid-as-ontology. I am not sure how to implement a query to the ontology, and the tooling has yet to come (blog).

I then tried to encode tax_id as a type, especially because there is also a Wikidata property that I could map to. See 068c60fbe4ab961f10f7841195dea57cf3aa45f5 in branch taxid-as-type. I am still unsure whether the user should (A) input an integer that the model knows is an NCBI TaxID or (B) prefix the ID with the correct namespace, as in NCBITax:2. I'm puzzled, especially because the conversion to ttl then does not expand the NCBI prefix.
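As an illustration of what accepting both (A) and (B) could look like outside the schema, here is a small Python sketch that normalizes either input to a CURIE and expands it to a full IRI. It is not part of the MIIID schema or of the LinkML tooling. It assumes the OBO prefix `NCBITaxon` and its PURL base rather than the `NCBITax` spelling used above, and the function names `to_curie` and `expand` are hypothetical.

```python
import re

# Assumption: OBO-style prefix and PURL base for the NCBI Taxonomy ontology.
NCBITAXON_PREFIX = "NCBITaxon"
NCBITAXON_BASE = "http://purl.obolibrary.org/obo/NCBITaxon_"

_CURIE = re.compile(r"^NCBITaxon:(\d+)$")


def to_curie(value):
    """Accept (A) a bare positive integer or (B) a prefixed CURIE; return a CURIE."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("a boolean is not a taxonomy identifier")
    if isinstance(value, int):
        if value <= 0:
            raise ValueError("taxonomy identifiers are positive integers")
        return f"{NCBITAXON_PREFIX}:{value}"
    if isinstance(value, str):
        match = _CURIE.match(value.strip())
        if match:
            return f"{NCBITAXON_PREFIX}:{int(match.group(1))}"
    raise ValueError(f"not an NCBI taxonomy identifier: {value!r}")


def expand(value):
    """Expand an identifier to a full IRI, as a Turtle export would need."""
    local_id = to_curie(value).split(":", 1)[1]
    return NCBITAXON_BASE + local_id
```

With this sketch, `to_curie(2)` and `to_curie("NCBITaxon:2")` give the same CURIE, so the choice between (A) and (B) becomes a question of user convenience rather than of what ends up in the data.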

cpauvert commented 1 year ago

Further modelling and technical questions (there was no time to present them at the NMDC x NFDI4Microbiota meeting of 2023-07-25):

@cmungall, as you were interested in having a closer look: suggestions and feedback would be much appreciated!

A. Modelling

A1 How to encode drop-out experiments with a focal strain?

A2 What is the cardinality of a multi-method paper?

B. Technical questions

B1 How to handle missing data (INSDC)? See #4

B2 How to constrain values to an ontology (via dynamic enums)?

B3 How to describe a slot with the human-readable ontology term and the machine-readable ontology number?
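For B3, one possible convention (an assumption for discussion, not something the schema implements) is to write the value as `label [PREFIX:ID]` and split it into its human-readable and machine-readable parts. The sketch below shows the idea in Python; `OntologyTerm` and `parse_term` are hypothetical names.

```python
import re
from dataclasses import dataclass

# Hypothetical convention: "<label> [<PREFIX>:<local id>]".
_TERM = re.compile(r"^(?P<label>.+?)\s*\[(?P<curie>[A-Za-z][\w.]*:\w+)\]$")


@dataclass(frozen=True)
class OntologyTerm:
    label: str  # human-readable term
    curie: str  # machine-readable identifier


def parse_term(text):
    """Split 'label [PREFIX:ID]' into its label and CURIE."""
    match = _TERM.match(text.strip())
    if match is None:
        raise ValueError(f"expected 'label [PREFIX:ID]', got {text!r}")
    return OntologyTerm(label=match.group("label"), curie=match.group("curie"))


term = parse_term("Escherichia coli [NCBITaxon:562]")
```

Keeping both parts in one value is easy for people to read and type, while the parsed form still allows the identifier to be checked against the ontology separately.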