joncison closed this issue 6 years ago
This is part of the automated QA / QC (https://docs.google.com/document/d/1ATj2zJOlbR3Edk6QyGvPX5HStZBknqfx1Fwqk4k0kqE/edit)
Link checking has been added to routine QA/QC checks; once existing errors are systematically fixed we can revisit the colour-coding of links etc.
We've experienced a bunch of dead homepage links while packaging tools here in Aarhus. I agree that an automated link-checking approach must be implemented and I think that we have the expertise and resources here in Aarhus to do it, since we have three student programmers working on bio.tools and packaging.
I'd like to make the suggestion more concrete:
I suggest the development of a simple service which periodically (e.g. monthly) checks all homepage links in the bio.tools database. IMO the service should be completely separate from the bio.tools web application.
The service will:
Furthermore:
Potentially, the service could also store a timestamp for when the tool was last checked.
Implementing this as a standalone service keeps the bio.tools code clean. It's also an approachable project for a student programmer.
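To make the proposal a bit more tangible, here is a minimal sketch of what such a standalone checker could look like. It is not existing bio.tools code: the names `LinkResult`, `fetch_status`, `check_link` and `dead_links` are hypothetical, and the input is assumed to be a plain mapping of tool ID to homepage URL exported from the database. It uses only the Python standard library and records a timestamp per check.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional
import urllib.error
import urllib.request


@dataclass
class LinkResult:
    url: str
    ok: bool
    status: Optional[int]   # HTTP status code, or None if no response arrived
    error: Optional[str]    # short error description for unreachable hosts
    checked_at: datetime    # when the check ran (the "last checked" timestamp)


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url (redirects are followed)."""
    request = urllib.request.Request(url, headers={"User-Agent": "link-checker-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as exceptions but still carry a status code.
        return exc.code


def check_link(url: str, fetch: Callable[[str], int] = fetch_status) -> LinkResult:
    """Check one homepage link and record the outcome with a timestamp."""
    now = datetime.now(timezone.utc)
    try:
        status = fetch(url)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # DNS failures, refused connections, timeouts and malformed URLs.
        return LinkResult(url, False, None, str(exc), now)
    return LinkResult(url, 200 <= status < 400, status, None, now)


def dead_links(homepages: dict[str, str],
               fetch: Callable[[str], int] = fetch_status) -> list[LinkResult]:
    """Check every tool homepage and return only the failing ones."""
    results = (check_link(url, fetch) for url in homepages.values())
    return [r for r in results if not r.ok]
```

The `fetch` parameter is only there so the logic can be exercised without network access; a periodic job (cron or similar) would call `dead_links` with the default fetcher and pass the result on to whatever reporting step is agreed.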
I noticed that the bio.tools proposal hints at a general QA/QC mechanism, or at least a generic way of storing this type of information in the database and showing it to users. It should be no problem integrating with such a mechanism.
What do you think?
We have this automatic checking functionality implemented at OpenEBench (Tools Monitoring section).
See the green/red dots at https://elixir.bsc.es/elixibilitas/
We can provide the list of dead links and the errors obtained as a REST endpoint. Then you can deal with authors in the way you said.
@redmitry can give more details.
This is still ongoing work; the plan is to also monitor "last-seen" time and "max time active".
Hello Dan,
We periodically check (every 6h) the homepages of all bio.tools tools.
[update: 23 nov 17]
https://openebench.bsc.es/monitor/metrics/bio.tools:{id}/{type}/{host}/project/website/operational
https://openebench.bsc.es/monitor/metrics/bio.tools:3d-fun/web/3dfun.bioinfo.pl/project/website/operational
Cheers, Dmitry
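For anyone scripting against this, the URL pattern above can be filled in as sketched below. The function name `operational_metric_url` is hypothetical, the response format is not described in this thread, and the endpoint may have changed since the 2017 update, so this only builds the URL and does not fetch or parse anything.

```python
from urllib.parse import quote

# Base path taken from the URLs quoted in the comment above.
BASE = "https://openebench.bsc.es/monitor/metrics"


def operational_metric_url(tool_id: str, tool_type: str, host: str) -> str:
    """Fill the bio.tools:{id}/{type}/{host} pattern for the 'operational' metric."""
    # Percent-encode each component so unusual characters cannot break the path.
    tid, ttype, thost = (quote(part, safe="") for part in (tool_id, tool_type, host))
    return f"{BASE}/bio.tools:{tid}/{ttype}/{thost}/project/website/operational"


print(operational_metric_url("3d-fun", "web", "3dfun.bioinfo.pl"))
# prints the example URL given above for 3d-fun
```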
@jlgelpi @redmitry That's great, a REST endpoint would be ideal.
@joncison I'd be interested in assigning this to one of our student programmers, but they would need access to the code base and potentially a small introduction to how it's organised (could be given in-person in Paris).
Implementation as proposed, in bio.tools, would be a really nice contribution.
@ekry: please take a look at this thread and arrange code access for Dan and his students, as needed. Dan, your guys will need to coordinate with Emil. This will have to wait until the deliverables are done (September onwards).
bumping priority - it's very closely related to https://github.com/bio-tools/biotoolsregistry/issues/223 and https://github.com/bio-tools/biotoolsregistry/issues/207
This issue was moved to bio-tools/biotoolsLint#12
Federico says ...
"I think there should be some sort of automatic “broken link” detection. In fact, when the registry will be open and available to everybody and the number of registered service will be huge, it will be impossible to manually curate the registry checking that all the tools are still alive. Some sort of color code could help distinguishing links that appear to be broken from just few hours, or days or months… After a service has been unreachable for a pre-defined amount of time its status should change to closed (or something similar). Another interesting function could be to send an e-mail alert to the service contacts (if they want) when the service appears to be down for longer than a pre-determined number of hours."
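The downtime-based colour code, the automatic "closed" status and the e-mail alert described in the quote could be expressed roughly as follows. This is only an illustrative sketch: the thread specifies no thresholds, so `ALERT_AFTER`, `CLOSE_AFTER`, the day/month boundaries and the status names are all assumptions, as are the function names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical thresholds; the thread does not specify any values.
ALERT_AFTER = timedelta(hours=24)   # e-mail contacts after this much downtime
CLOSE_AFTER = timedelta(days=180)   # mark the entry "closed" after this much


def link_status(down_since: Optional[datetime], now: datetime) -> str:
    """Map how long a link has been unreachable to a coarse status label."""
    if down_since is None:
        return "up"
    downtime = now - down_since
    if downtime >= CLOSE_AFTER:
        return "closed"
    if downtime >= timedelta(days=30):
        return "down-months"
    if downtime >= timedelta(days=1):
        return "down-days"
    return "down-hours"


def should_alert(down_since: Optional[datetime], now: datetime,
                 contact_opted_in: bool) -> bool:
    """Alert only opted-in contacts, and only once the outage is long enough."""
    if down_since is None or not contact_opted_in:
        return False
    return now - down_since >= ALERT_AFTER
```

Each label could then be mapped to a colour in the interface; keeping the thresholds as named constants makes them easy to tune once real outage data is available.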