galaxyproject / galaxy_codex

Galaxy Communities Dock aka Galaxy Codex: catalog of Galaxy resources (tools, training, workflows)
https://galaxyproject.github.io/galaxy_codex/
MIT License

EDAM annotation in XML tool definition vs bio.tools #228

Open kostrykin opened 2 weeks ago

kostrykin commented 2 weeks ago

I had a discussion with Bérénice yesterday and it was suggested that I create an issue here to further discuss things.

AFAIK the Codex uses the EDAM annotations from bio.tools and ignores those in the XML tool definitions. However, there are numerous Galaxy tools in our community that all wrap the same bio.tools entry, e.g., scikit-image, which is a very generic library for image processing, image analysis, and visualization.

Two examples of such Galaxy tools are Filter 2-D image and Compute Voronoi tessellation. Both wrap scikit-image, so both inherit the EDAM annotations of scikit-image, right? However, Filter 2-D image is a tool for image processing and certainly not analysis, while Compute Voronoi tessellation belongs to image analysis rather than processing.

The question now is, how can we establish a finer level of granularity for Galaxy tools? On the one hand, it seems natural not to ignore the EDAM annotations in the Galaxy tool wrappers, but to give them precedence. On the other hand, this might be dangerous because the EDAM annotations in bio.tools are usually more reliable. Maybe the middle ground here is to give precedence to the EDAM annotations in the Galaxy tool wrappers only for specific tools (those whose bio.tools entry is too generic).
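To make the middle-ground option concrete, here is a minimal sketch. It is not how the Codex currently works. The function name `choose_edam_terms`, the set `GENERIC_BIOTOOLS_IDS`, and the use of `"scikit-image"` as a bio.tools ID are all assumptions for illustration, and the term strings are placeholders rather than real EDAM identifiers.

```python
from typing import Iterable, List, Optional, Set

# Hypothetical allowlist of bio.tools IDs considered too generic to
# describe every Galaxy tool that wraps them.
GENERIC_BIOTOOLS_IDS: Set[str] = {"scikit-image"}


def choose_edam_terms(
    biotools_id: Optional[str],
    wrapper_terms: Iterable[str],
    biotools_terms: Iterable[str],
    generic_ids: Set[str] = GENERIC_BIOTOOLS_IDS,
) -> List[str]:
    """Pick EDAM terms for one Galaxy tool under the middle-ground policy."""
    wrapper = list(wrapper_terms)
    biotools = list(biotools_terms)
    # Wrapper annotations win only for allowlisted (generic) entries.
    if biotools_id in generic_ids and wrapper:
        return wrapper
    # Otherwise bio.tools stays the default source when it has terms.
    if biotools:
        return biotools
    # Fall back to whatever the wrapper declares (possibly nothing).
    return wrapper
```

With this policy, Filter 2-D image and Compute Voronoi tessellation could carry different terms even though they share a bio.tools entry, while tools wrapping a specific entry would keep the bio.tools annotations.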

xref https://github.com/beatrizserrano/BH2024-project17/issues/1

hmenager commented 3 days ago

My two cents here: I would advocate having EDAM annotations in Galaxy tool XML files completely override the ones in bio.tools because, as you said, they should be finer grained. If EDAM annotations in bio.tools are more reliable, this is probably a curation issue in Galaxy tools. Thoughts, @matuskalas?

bgruening commented 3 days ago

I can only add background to the initial idea. We thought that bio.tools provides us with good default EDAM terms. So whenever a tool has a bio.tools ID and NO EDAM annotations of its own, we retrieve the terms from bio.tools. However, if the tool developer is not happy with the bio.tools annotation, for whatever reason, they can add EDAM terms to the tool as well, with or without a bio.tools ID. In that case Galaxy will take the EDAM terms from the tool and not the ones from bio.tools.
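The fallback rule described above can be sketched roughly as follows. This is only an illustration of the described behaviour, not Galaxy's actual code: `resolve_edam_terms` and the `biotools_lookup` mapping are hypothetical names, and the term strings are placeholders rather than real EDAM identifiers.

```python
from typing import Iterable, List, Mapping, Optional


def resolve_edam_terms(
    tool_terms: Iterable[str],
    biotools_id: Optional[str],
    biotools_lookup: Mapping[str, List[str]],
) -> List[str]:
    """Return the tool's own EDAM terms, else the bio.tools defaults."""
    own = list(tool_terms)
    # Terms declared in the tool always take precedence,
    # whether or not a bio.tools ID is present.
    if own:
        return own
    # No own terms: use bio.tools as the default source, if linked.
    if biotools_id is not None:
        return list(biotools_lookup.get(biotools_id, []))
    # Neither own terms nor a bio.tools ID: nothing to report.
    return []


# Placeholder data standing in for a bio.tools query result.
lookup = {"example-id": ["term_default"]}
print(resolve_edam_terms([], "example-id", lookup))
print(resolve_edam_terms(["term_specific"], "example-id", lookup))
```

The first call falls back to the bio.tools defaults; the second keeps the term declared in the tool.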