Would an <organisationalConcept> annotation be useful?

edamontology / edamontology

EDAM is a domain ontology of data analysis and data management in bio- and other sciences, and science-based applications. It comprises concepts related to analysis, modelling, optimisation, and data life cycle. Targeting usability by diverse users, EDAM's structure is relatively simple, divided into 4 sections: Topic, Operation, Data, and Format.

https://edamontology.org

Creative Commons Attribution Share Alike 4.0 International


Would an <organisationalConcept> annotation be useful? #265

Open joncison opened 7 years ago

joncison commented 7 years ago

We used this in EFO to indicate concepts which were mostly for organisational purposes (structuring the tree) and were not really intended for annotation. There are lots of examples of these in EDAM (including all terms at the topic level).

Should we do it for EDAM? I imagine the value could be in two parts:

- to see what concepts we don't think are very useful for annotation (and so, do we really need them?)
- having bio.tools use the annotation, e.g. as a hint for the end-user when picking terms

Thoughts welcome ...

veitveit commented 7 years ago

I see the following advantages to using "organisational" concepts:

- Less redundant annotations in bio.tools
- Better selection of correct terms from automated annotation (e.g. edamMap)
- Annotations by "organisational" terms can be used when no matching term is available. We can then identify e.g. operations that need to be added.


joncison commented 7 years ago

It would be trivial to support the annotation and then "just" a (huge) manual task to make the annotations. Before moving on this, I'd like to hear what @matuskalas and @hmenager think.

matuskalas commented 7 years ago

This we have discussed many times in the history of EDAM development, and we always concluded not to do it. It is both a bad idea from the (onto)logical point of view, and a completely unnecessary nonsense, too. That is, it would add work to us maintainers, and add noise to end users.

Applications such as edamMap should rather rely on more generic and thoughtful strategies, such as matching as specific a concept as possible while staying truthful. The same goes for human annotators.
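The "as specific as possible" strategy can be illustrated with a minimal sketch. It assumes a set of candidate concepts has already been matched and simply discards any candidate that is an ancestor of another candidate. This is not edamMap's actual algorithm; the `PARENTS` map, its labels, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical is_a links (child -> parents); labels are illustrative, not copied from EDAM.
PARENTS: Dict[str, Set[str]] = {
    "Operation": set(),
    "Comparison": {"Operation"},
    "Sequence comparison": {"Comparison"},
    "Sequence alignment": {"Sequence comparison"},
    "Conversion": {"Operation"},
}


def ancestors(concept: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every concept reachable from `concept` by following is_a links upwards."""
    seen: Set[str] = set()
    stack = list(parents.get(concept, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def most_specific(candidates: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Keep only candidates that are not an ancestor of another candidate."""
    candidate_set = set(candidates)
    covered: Set[str] = set()
    for concept in candidate_set:
        covered |= ancestors(concept, parents)
    return candidate_set - covered


matches = {"Operation", "Comparison", "Sequence alignment", "Conversion"}
print(sorted(most_specific(matches, PARENTS)))
```

With a filter of this kind, broad concepts drop out whenever a more specific match exists, without any concept being specially marked in the ontology.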

Note: The whole of EDAM is "organisational", that's exactly the point of a domain ontology! ;-)

matuskalas commented 7 years ago

The question remains what to do with the Format (by type) and Identifier (by type) concepts. They have little (onto)logical relevance and are USUALLY (see Note) not useful for annotation. On the other hand, they are useful in ontology browsers like Protégé, BioPortal, and OLS, which can't navigate very nicely by "is identifier of" / "is format of". How important the usability of such navigation is, however, is another question.

Note: A concept like Sequence identifier can be relevant when annotating a data input field or a data model/schema field.

joncison commented 7 years ago

Not just those two, but also many (but not all) of the 2nd level terms, e.g. concepts in the Operation branch such as Comparison, Conversion etc.; and also (and here are the real devils that need to be rooted out) a few (not many) concepts deeper than that. Perhaps a different approach is to clean up the "deeper" cases first? (Operation and Data mostly; Format and Topic are basically OK.)

joncison commented 7 years ago

For now, I'm replacing comments of the following type: <rdfs:comment>This is a broad concept and is used a placeholder for other, more specific concepts.</rdfs:comment>

with a new uiTip annotation <uiTip>Not recommended for annotation in bio.tools.</uiTip>

I'll then systematically go through the top 2 levels of each branch, say.
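Going through "the top 2 levels of each branch" can be approximated by a depth-limited breadth-first walk. The sketch below assumes the branch is available as a parent-to-children map; the `CHILDREN` data and the function name are hypothetical, and the branch root is counted as level 1.

```python
from collections import deque
from typing import Dict, List

# Hypothetical parent -> children links for one branch; labels are illustrative only.
CHILDREN: Dict[str, List[str]] = {
    "Operation": ["Comparison", "Conversion"],
    "Comparison": ["Sequence comparison"],
    "Sequence comparison": ["Sequence alignment"],
}


def concepts_up_to_depth(root: str, children: Dict[str, List[str]], max_depth: int) -> Dict[str, int]:
    """Breadth-first walk returning {concept: depth} for all depths <= max_depth (root is depth 1)."""
    depths = {root: 1}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        if depths[current] >= max_depth:
            continue  # do not descend below the requested level
        for child in children.get(current, []):
            if child not in depths:
                depths[child] = depths[current] + 1
                queue.append(child)
    return depths


for concept, depth in concepts_up_to_depth("Operation", CHILDREN, 2).items():
    print(depth, concept)
```

The resulting list is only a set of candidates for review; whether each one is really a placeholder remains a manual decision.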

joncison commented 7 years ago

uiTip has been added to:

- all top-level concepts
- nearly all 2nd tier operations
- Format (by type of data) (http://edamontology.org/format_2350) and all its kids
- a few other places (look and you'll see)

no doubt more could be done - and I think any new cases may be candidates for deprecation (we don't want too many organisational classes)

cc @veitveit @hmenager @ FTI

matuskalas commented 7 years ago

If at all, then this should be done in an application-unspecific manner (i.e. not Bio.Tools-specific).

Tooltip is again a very different thing, and all reasonable applications using EDAM should construct their tooltips from the generally valid attributes of an EDAM concept, most importantly definition and comment-s.

Reopening as this is obviously a work in progress.

joncison commented 7 years ago

I'm open to suggestions on renaming the uiTip element and the contained message if you think that would be worthwhile.

A 2nd pass through - to find all "organisation classes" and either annotate them or (if possible & desirable) deprecate them, would be beneficial (I think I got most of them, but ...) I'm doing that for Operation branch a bit, at least - and it's looking better for it.

... but I think this definitely can help e.g. https://github.com/edamontology/edammap and the new bio.tools Tool Annotator.

joncison commented 7 years ago

@matuskalas: I'd like to close this issue, so I'll wait for you to say what needs to be done. Thanks!

matuskalas commented 7 years ago

This has many different aspects, and thus needs some serious considerations and good decisions case-by-case. I'm convinced that we need a dedicated 1-day hackathon together with involved implementers (EDAMMap, Bio.Tools, BISE/biii.eu, ..., maybe Debian etc.), to take the right decisions. I hope for a CodeFeast :-)

Some notes:

- A similar (but slightly different) functionality is also required by the NEUBIAS BISE/biii.eu.
- There shouldn't be anything implementation-specific in EDAM (loose coupling, reusability).
- uiTip in this form is unfavoured; we should use the standard oboInOwl:definition and rdfs:comment for visualisation in GUIs.
- Only if absolutely necessary, a further granularity of comments may be added, e.g. userTip being a specialisation of rdfs:comment.
- organisationalConcept is disputable as a special notion, because the whole ontology is "organisational". "Navigation only", as a complement of "for annotation", sounds better.
- N.B. different implementations need different concepts for navigation, and different ones again for annotation. E.g.:
  - registry A needs Analysis for navigation but not annotation
  - registry B needs Analysis for navigation AND for annotation
  - registry C does not need Analysis for either navigation or annotation
- A solution that may use general tooling (required also for BISE) could be implementation-UNspecific subsets. Instead of "negatively" marking the concepts that aren't for annotation, this alternative would "positively" mark the remaining concepts as being for annotation. Advantage: an EDAM subset "for annotation" can be generated, with proper subsumption and other relations. This is one of the options needing a proper analysis with all implementers involved.
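The "positive" subset idea can be sketched roughly as follows: given the concepts marked as being for annotation, build a reduced hierarchy in which each kept concept is attached to its nearest kept ancestors, so that subsumption survives the removal of navigation-only concepts. All data and names here are hypothetical, and the sketch does not remove redundant links when a concept has several paths to the root.

```python
from typing import Dict, Set

# Hypothetical is_a links (child -> parents) and "for annotation" marks; not real EDAM data.
PARENTS: Dict[str, Set[str]] = {
    "Operation": set(),
    "Analysis": {"Operation"},
    "Sequence analysis": {"Analysis"},
    "Sequence alignment": {"Sequence analysis"},
}
FOR_ANNOTATION: Set[str] = {"Sequence analysis", "Sequence alignment"}


def nearest_kept_ancestors(concept: str, parents: Dict[str, Set[str]], keep: Set[str]) -> Set[str]:
    """Walk upwards from `concept`, stopping at the first kept concept on each path."""
    found: Set[str] = set()
    seen: Set[str] = set()
    stack = list(parents.get(concept, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in keep:
            found.add(current)  # stop here: this ancestor is part of the subset
        else:
            stack.extend(parents.get(current, ()))  # skip over a dropped concept
    return found


def annotation_subset(parents: Dict[str, Set[str]], keep: Set[str]) -> Dict[str, Set[str]]:
    """Return child -> parents links restricted to the kept concepts."""
    return {c: nearest_kept_ancestors(c, parents, keep) for c in parents if c in keep}


print(annotation_subset(PARENTS, FOR_ANNOTATION))
```

Because the kept set is just an input, registries A, B, and C from the example above could each pass their own set without anything implementation-specific being stored in the ontology itself.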

Bottom line: This issue needs more future coordination & work, and thus can't be closed as of now.

joncison commented 7 years ago

Great, thanks. So I'll leave it as-is for now.

joncison commented 6 years ago

I systematically replaced <uiTip> with <usageGuideline> which is better. Issue still open and awaiting a more systematic approach cc @matuskalas @hansioan

joncison commented 6 years ago

UPDATE Once the revision to the Editors Guide (http://edamontologydocs.readthedocs.io/en/latest/editors_guide.html) is done, we'll be able to revisit this, and then I think action the systematic annotation of all placeholders for all subontologies - more formalised notions of "Placeholder" (organisational class, whatever) are dropping out from the Guidelines.

matuskalas commented 6 years ago

Instead of <usageGuideline>Human-readable text</usageGuideline>, I'd prefer seeing <notRecommendedForAnnotation>true</notRecommendedForAnnotation>

joncison commented 5 years ago

@matuskalas I refactored things as per your suggestion above, and retaining <usageGuideline> (it could be useful in future).

Concepts now annotated as <notRecommendedForAnnotation>true</notRecommendedForAnnotation> are the "placeholders" as described here (however, not yet systematically - some placeholders still need to be annotated as such)
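A consumer could read the boolean annotation along these lines. This is a minimal sketch using only the Python standard library; it assumes the ontology is serialised as RDF/XML and that the property lives in the http://edamontology.org/ namespace. The `SAMPLE` fragment and its example.org URIs are hand-written for illustration, not copied from an EDAM release.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
EDAM = "http://edamontology.org/"  # assumed namespace of the annotation property

# Hand-written fragment for illustration only.
SAMPLE = """<rdf:RDF xmlns="http://edamontology.org/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/placeholder_concept">
    <notRecommendedForAnnotation>true</notRecommendedForAnnotation>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/specific_concept"/>
</rdf:RDF>"""


def not_recommended(rdf_xml: str) -> List[Optional[str]]:
    """Return the rdf:about URIs of classes flagged notRecommendedForAnnotation=true."""
    root = ET.fromstring(rdf_xml)
    flagged = []
    for cls in root.iter(f"{{{OWL}}}Class"):
        flag = cls.find(f"{{{EDAM}}}notRecommendedForAnnotation")
        if flag is not None and (flag.text or "").strip().lower() == "true":
            flagged.append(cls.get(f"{{{RDF}}}about"))
    return flagged


print(not_recommended(SAMPLE))
```

A user interface could then grey out or hide the returned URIs when a user picks terms, while still showing them for navigation.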

joncison commented 5 years ago

Just added <notRecommendedForAnnotation>true</notRecommendedForAnnotation> systematically to Identifier (by type of data), all of its immediate kids, and some sub-kids as required.