Big-Bee-Network / bee-interaction-database

Interactions from the literature about bees

suggest fixes for interaction_types_mapping.csv #17

Closed jhpoelen closed 3 years ago

jhpoelen commented 3 years ago

related to https://github.com/Extended-Bee-Network/bee-interaction-database/issues/16 -

roughly two issues:

  1. mapping to an unsupported interaction type (i.e., aggressive behavior / http://purl.obolibrary.org/obo/GO_0002118)
  2. multiple ambiguous mappings defined from http://purl.obolibrary.org/obo/RO_0002574 (e.g., https://github.com/Extended-Bee-Network/bee-interaction-database/blob/9d5cebff7f0588de589f0972c24d2561750ef9c2/interaction_types_mapping.csv#L8)

Addressed by (1) mapping to interactsWith and (2) removing the "from" identifiers (the provided ids), leaving only the labels (e.g., fight over nest of).

Curious to hear your thoughts!

jhpoelen commented 3 years ago

oh and here's the review with the most recent elton:

from https://travis-ci.com/github/jhpoelen/bee-interaction-database/builds/210450756 -

reviewing [local] using Elton version [0.10.7].
updating [local]... done.
creating review [local]... done.
listing interactions [local]... done.
listing nanopubs [local]... done.

review of [jhpoelen/bee-interaction-database] included:
  - 3228 interaction(s)
  - 6 note(s)
  - 3228 info(s)

[jhpoelen/bee-interaction-database] has 6 reviewer note(s):
      5 target taxon name missing
      1 found invalid location: [invalid (latitude, longitude) = (33.997278  ,-119.714056)]

seltmann commented 3 years ago

I am curious what should go into the field provided_interaction_type_id. I was assuming it was an archive of the interaction ID that was provided by the resource but not used, whereas mapped_to_interaction_type_id would be the interaction ID to use.

jhpoelen commented 3 years ago

@seltmann I agree that the provided id is a useful thing to keep. What I am trying to figure out is how to establish unambiguous mappings between provided and translated terms.

Currently, the design is such that:

  1. the provided id takes precedence over a provided label
  2. no duplicate provided terms are allowed
  3. only GloBI supported terms can be translated into

With 1., a provided pair (eaten by, example.org/RO_123) would be seen as example.org/RO_123, ignoring the label eaten by. However, when only a label is provided, as in the pair (eaten by, [empty]), then GloBI sees only eaten by.

So, if the original text provides identifiers that mean different things depending on their context (e.g., different labels for the same id), then GloBI will complain because of 2. no duplicate provided terms are allowed.
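
To make the three rules concrete, here is a minimal Python sketch of how such a check could behave. It is not Elton's or GloBI's actual code: the function names, the `SUPPORTED_TARGETS` set, and the `eatenBy` target are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical stand-in for the list of GloBI supported terms (rule 3).
SUPPORTED_TARGETS = {"interactsWith", "eatenBy"}


def provided_key(label: str, term_id: str) -> str:
    """Rule 1: the provided id takes precedence over the provided label."""
    term_id = term_id.strip()
    return term_id if term_id else label.strip()


def build_mapping(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, str]:
    """Build a provided-term -> target-term lookup.

    Each row is (provided label, provided id, target term).
    """
    mapping: Dict[str, str] = {}
    for label, term_id, target in rows:
        key = provided_key(label, term_id)
        if target not in SUPPORTED_TARGETS:
            # Rule 3: only supported terms can be translated into.
            raise ValueError(f"unsupported target term: {target}")
        if key in mapping:
            # Rule 2: no duplicate provided terms are allowed.
            raise ValueError(f"duplicate provided term: {key}")
        mapping[key] = target
    return mapping
```

Under this sketch, two rows that share example.org/RO_123 but carry different labels collide on rule 2, because rule 1 has already discarded the labels.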

I wonder whether in this example the provided label should be taken as context or can be seen as optional/informational. Curious to hear your thoughts on this.

seltmann commented 3 years ago

@jhpoelen Interesting question. Is the provided_interaction_type_id something to match on or just an interpretation of provided_interaction_type_label if both are present? At least this is how I am interpreting your question.

I would say that provided_interaction_type_label and provided_interaction_type_id do not both need to be present. A person could have eaten by only, eaten by and RO_123, or RO_123 only. If a person provides eaten by and RO_123, I would decide to require both eaten by and RO_123. If this is more complicated than necessary (which it might be), I would prioritize the label (eaten by).
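
The two alternatives suggested here could look roughly like the following Python sketch. The function names and the fights with label are hypothetical, and this is one possible reading of the proposal, not something GloBI implements.

```python
from typing import Tuple


def pair_key(label: str, term_id: str) -> Tuple[str, str]:
    """Option A: match on whatever was provided.

    When both label and id are present, both are required to match,
    so the same id with different labels yields different keys.
    """
    return (label.strip(), term_id.strip())


def label_first_key(label: str, term_id: str) -> str:
    """Option B (simpler): prioritize the label, fall back to the id."""
    label = label.strip()
    return label if label else term_id.strip()


# Same id, different labels: distinct under option A.
a = pair_key("eaten by", "RO_123")
b = pair_key("fights with", "RO_123")  # hypothetical second label
print(a != b)
```

With option A the label acts as context for the id, so identifiers reused with different labels no longer count as duplicates. Option B reaches the same outcome for this case by ignoring the id whenever a label exists.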