bio-tools / biotoolsRegistry

biotoolsregistry : discovery portal for bioinformatics
GNU General Public License v3.0

Links to Debian packages #315

Open matuskalas opened 6 years ago

matuskalas commented 6 years ago

We would like to finally have more links to source packages in Debian visible in Bio.Tools.

There are two options of who would add them en masse:

Most important is the link to the source package information page, in the form of https://tracker.debian.org/pkg/{src_pkg_name}, e.g. https://tracker.debian.org/pkg/bowtie
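As a rough illustration of that URL pattern, here is a minimal Python sketch. The function name `tracker_url` is hypothetical, and percent-encoding the package name is an assumption made for safety, not something the tracker is known to require.

```python
from urllib.parse import quote

# Base of the source package information pages mentioned above.
TRACKER_BASE = "https://tracker.debian.org/pkg/"


def tracker_url(src_pkg_name: str) -> str:
    """Build the tracker page URL for a Debian source package name."""
    name = src_pkg_name.strip()
    if not name:
        raise ValueError("source package name must not be empty")
    # Assumption: percent-encode anything outside the unreserved URL characters.
    return TRACKER_BASE + quote(name, safe="")


print(tracker_url("bowtie"))
```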

Links to binary packages can be reached from the source package info page mentioned above. There are multiple binary packages for a given source package, and they differ between Debian versions. All the links to these are listed there.

joncison commented 6 years ago

Cool ... if you can provide a csv file:

biotoolsID,link_to_Debian

then I'm sure @hansioan can do this (link to source packages) rather fast, or ask someone if necessary.

Is there such a mapping file?
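For illustration only, a file in that two-column format could be produced along these lines. This is a sketch: the helper `write_link_csv` and the example mapping are hypothetical, and the assumption that the bio.tools ID for bowtie is `bowtie` has not been checked against bio.tools.

```python
import csv
import io


def write_link_csv(mapping, out):
    """Write biotoolsID,link_to_Debian rows for a {biotoolsID: source package} mapping."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["biotoolsID", "link_to_Debian"])
    # Sort by ID so repeated runs give a stable, diff-friendly file.
    for biotools_id, src_pkg in sorted(mapping.items()):
        writer.writerow([biotools_id, f"https://tracker.debian.org/pkg/{src_pkg}"])


# Hypothetical example mapping; a real one would come from the Debian side.
example = {"bowtie": "bowtie"}

buf = io.StringIO()
write_link_csv(example, buf)
print(buf.getvalue(), end="")
```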

matuskalas commented 6 years ago

This is super awesome, thx in advance @joncison & @hansioan !!

@smoe is currently generating the TSV, based on his & @tillea's work on adding Bio.Tools references to Debian. Stay tuned...

joncison commented 6 years ago

If it turns out we haven't got complete coverage of Debian Med packages in bio.tools, pls. provide a list of tools and whatever metadata you have - we'll create basic entries for these.

joncison commented 6 years ago

@matuskalas - any word on the TSV? then we can make a move on this. Thanks !

smoe commented 6 years ago

Hi Jon,

That is fantastic to hear.

That TSV is a large table generated from the "ultimate debian database" of packages maintained by the Debian Med folks. That is basically changing every day when a new version of one of the tools is uploaded. The script was crafted by @tillea when we met in Lyngby. I have now found only a copy that @matuskalas and I apparently saved on https://github.com/bio-tools/biotoolsConnect/blob/master/DebianMed/edam.sh - I will make sure this script gets a reference to its original location once that resurfaces.

Is there a chance you run that script on your end? If possible then you could always have the very latest and shiniest "dump" without involving (i.e. waiting for) anyone. Please contact me directly to get you started if you run into problems and I can also send you the file as created on my side, just as a start.

My understanding is that you somehow want to automate the merging of what Debian provides with the bio.tools entries. You see the packages that have references to bio.tools already. These should be straight-forward to start with. I personally hope that there is also a reference back to Debian on the bio.tools side. For the packages with no current bio.tools reference I am not so sure. For instance, you have https://bio.tools/ImageJ_2.0.0 when Debian's package is just called - well - imagej. I think you need some kind of policy to consistently decide if a new imagej entry should be created or if that existing one should receive just a ref to Debian. Difficult are also those situations in which bio.tools describes a web service of a tool but is silent about the binary.

Cheers,

Steffen

tillea commented 6 years ago

I think the latest version of the script is here: https://salsa.debian.org/blends-team/website/blob/master/misc/sql/edam.sh I'd like to stress again as I always did: I never ever intended to craft a tool (as Matus is continuously calling it). It's just a primitive demo of how to drain relevant information from UDD, as everybody can do, and which is probably useful for biotools. My hope was that some biotools developer would take over this idea. Kind regards, Andreas.

joncison commented 6 years ago

Thanks @smoe and @tillea for the links and info. We'll take a look at automating the addition of links (and whatever other metadata we can get) to the bio.tools entries, thus properly referencing Debian. We can ensure coverage of tools in Debian (easy) as we go.

Other issues (e.g. around tool versions, services over tools etc.) are a bit trickier, but it's on our radar. cc @matuskalas FYI.

hmenager commented 5 years ago

During the last debian med sprint, we continued working on this with @smoe , @matuskalas and some help from @tillea and @mr-c.

We have made some real progress there, and @ValentinMarcon (cc-ed) is currently working to improve and clean up this work. We already have an existing mapping in a YAML file we published to GitHub:

https://github.com/bio-tools/debian-med-links-analysis/blob/master/Mapping_db_bt.yaml

It contains debian links for 235 bio.tools entries, ready to include! Our point here, as discussed with the debian med community, is that it would be nice for these entries to be linked to debian med packages pages, with a small debian-styled logo at the bottom of the tool card, e.g. a link to:

https://packages.debian.org/search?keywords=abyss&searchon=names&exact=1&suite=all&section=all
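A small sketch of how such a search link might be generated per package. The query parameters are copied from the example URL above, and the function name `debian_search_url` is hypothetical.

```python
from urllib.parse import urlencode

SEARCH_BASE = "https://packages.debian.org/search"


def debian_search_url(package_name: str) -> str:
    """Build an exact-name package search link across all suites and sections."""
    params = {
        "keywords": package_name,
        "searchon": "names",
        "exact": "1",
        "suite": "all",
        "section": "all",
    }
    # urlencode keeps the dict's insertion order and escapes the values.
    return SEARCH_BASE + "?" + urlencode(params)


print(debian_search_url("abyss"))
```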

hmenager commented 5 years ago

(also ping @joncison and @hansioan)

joncison commented 5 years ago

thanks for the work here guys ... a lot of holidays and work travels coming up, but me and @hansioan will pick up on this late April

joncison commented 5 years ago

@smoe cc @matuskalas just pasting an (edited version of the) msg. of yours Steffen which I missed at the time (sorry!) so things don't get lost

"I very much agree we need something to cross-feed what we have. For the moment I tend to think that without extra value that bio.tools brings for the Debian community there will not be much incentive for my local peers to contribute anything that cannot be automated ..."

"The sharing between distributions (conda/debian-> bio.tools AND BACK!) would possibly see things starting. The mere package descriptions are rather trivial in comparison with the packaging itself - that will not be it. What comes to mind is the EDAM annotation in its detailed workflow-preparing form that we had seeded in 2014/5 as in https://salsa.debian.org/med-team/bowtie2/blob/master/debian/upstream/edam . We never decided how to make use of that work. I presume all the "workflows with galaxy"-folks have come up with something similar in the meantime. Can we possibly come up with a consensus annotation for that?"

"So, if we as a community could come up with a manually curatable format (I like the YAML we came up with, don't make it XML) to describe also fractions of the package, then I think we are where mutual benefit starts ...."

smoe commented 5 years ago

The Debian world is currently stuck (frozen) because of the emerging release of Buster. Also, I still don't think that Debian has really found its sweet spot yet with the increasing adoption of Conda for about everything everywhere, but hey, yes, by all means bring pointers to our and Conda's packages along. IIRC we once had plans to meet up about it. DTU or KU is not too much of a problem for me - just tell me. Or come south :)