bio-tools / biotoolsRegistry

biotoolsregistry : discovery portal for bioinformatics
GNU General Public License v3.0

Dodgy "Library" tool type annotations #464

Open joncison opened 4 years ago

joncison commented 4 years ago

A quick look at these: https://bio.tools/t?page=2&toolType=%27Library%27&sort=score

shows that quite a lot of the "Library" annotations look wrong (to me at least). The main offender is R packages labelled as "Library" when they're just packages for the R environment.

The definition of "Library" currently is: "A collection of components that are used to construct other tools. bio.tools scope includes component libraries performing high-level bioinformatics functions but excludes lower-level programming libraries."

cc @hansioan

hansioan commented 4 years ago

@joncison R packages are indeed Libraries; if the Library definition does not cover them, then we need to change the definition. In my view, the Library tool type refers to R packages, modules and other sorts of bioinformatics packages/libraries/modules used in programming languages.

joncison commented 4 years ago

It's a tricky one I think, and shows some of the limits of the way we handle tool types.

Although the R packages are libraries in the sense they bundle functions and maybe data (for the R environment), a lot of them (at least most of the ones I've seen in bio.tools) aren't actually collections of functions. They look more like simple command-line tools with a single function. So to describe them as "Library" is weird. It would be a bit like describing a tool in the EMBOSS suite as a library.

Hmm, not sure.

We could introduce the concept of "R package" as a tool type? That would be more obvious / usable e.g. in search?
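
To make the search argument concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the records are made up, the field names (`toolType`, `language`) are only modelled loosely on bio.tools annotations, and "R package" is the hypothetical new type, not something that exists today.

```python
# Hypothetical, hand-made records: the field names and values below are
# illustrative assumptions, not the real bio.tools schema or data.
TOOLS = [
    {"name": "pkgA", "toolType": ["Library"], "language": ["R"]},
    {"name": "pkgB", "toolType": ["Library"], "language": ["Python"]},
    {"name": "pkgC", "toolType": ["Command-line tool"], "language": ["R"]},
]


def find_r_packages_today(records):
    """With only a generic "Library" type, two filters must be combined."""
    return [
        r["name"]
        for r in records
        if "Library" in r["toolType"] and "R" in r["language"]
    ]


def find_by_tool_type(records, tool_type):
    """With a dedicated type, a single filter would be enough."""
    return [r["name"] for r in records if tool_type in r["toolType"]]


def with_r_package_type(records):
    """Return copies in which R libraries also carry the hypothetical "R package" type."""
    out = []
    for r in records:
        types = list(r["toolType"])
        if "Library" in types and "R" in r["language"]:
            types.append("R package")
        out.append({**r, "toolType": types})
    return out
```

In this toy data, `pkgC` (an R tool annotated only as a command-line tool) is not found by either approach, so a new type would help search but would not by itself fix inconsistent annotations.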

In the longer term (as @matuskalas has consistently suggested), we need a proper modelling of collections in bio.tools, and perhaps of software interface.

hansioan commented 4 years ago

I think an R package is more of a Library than a command-line tool anyway. You don't really use an R package on the command line; you use it in your R code.

I don't really see the usage of Library outside of R packages and other languages' packages, modules etc. Adding a tool type of "R package" seems a bit too targeted, and if we start doing that, we might end up with many more such types. We already have the tool type "SPARQL endpoint", with which I disagree on the same basis and which only covers 29 tools. On the other hand, the good thing about targeted "tags" in general, whether tool types or other annotations, is that they help with search.

Collections can certainly be improved; at the moment they are just tags that anyone can assign. I think subdomains might be what replaces them.

I remember there used to be a software interface along with something else which got merged into tool type.