bio-tools / biotoolsRegistry

biotoolsregistry : discovery portal for bioinformatics
GNU General Public License v3.0

Labelling Tool Cards based on broken homepageURL #207

Closed joncison closed 1 month ago

joncison commented 7 years ago

As agreed, need to:

  1. systematically identify all broken homepageURL links (see https://github.com/bio-tools/biotoolsregistry/issues/88) - presumably this is available in the internal QC report?
  2. repair broken links
  3. or if the website cannot be found, delete the entry (if owned by bio.tools admin) or contact the owner (otherwise)

joncison commented 7 years ago

I suggest a coordinated action with @Ahto123 and Tomas.

hansioan commented 6 years ago

The approach we take on homepages is that we will periodically query homepages and annotate homepage status thus:

An automatic process should check the URL (for code 200 OK) and update the status as follows:

  1. query returns 200 OK --> change status to 1
  2. query does not return 200 OK --> if homepage status is 1 in the DB, change it to 0; if homepage status is 2, do nothing

When changing a status after a query, record the timestamp at which the query was made. If the homepage status is already 0 and a later query still fails, retain the oldest timestamp (i.e. when the homepage was first seen as down).
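The status rules above can be sketched roughly as follows. This is only an illustration, not the bio.tools backend: the `HomepageRecord` class, its field names and the `apply_check` function are hypothetical, and since the meaning of status 2 is not spelled out here, the sketch follows the rules literally.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

STATUS_DOWN = 0  # homepage seen as down
STATUS_UP = 1    # homepage returned 200 OK
# Status 2 is left alone on a failed query, as rule 2 says.


@dataclass
class HomepageRecord:
    """Hypothetical stand-in for the homepage fields stored in the DB."""
    status: int
    status_since: Optional[datetime] = None  # when the current status was first seen


def apply_check(record: HomepageRecord, http_code: int, checked_at: datetime) -> HomepageRecord:
    """Update a record from one HTTP query, following the two rules literally."""
    if http_code == 200:
        new_status = STATUS_UP          # rule 1: 200 OK --> status 1
    elif record.status == STATUS_UP:
        new_status = STATUS_DOWN        # rule 2: was 1, query failed --> status 0
    else:
        return record                   # rule 2: status 0 or 2 --> do nothing

    # Only stamp the time when the status really changes, so a homepage that
    # stays down keeps the oldest timestamp (when it was first seen as down).
    if new_status != record.status:
        record.status = new_status
        record.status_since = checked_at
    return record
```

For instance, a record with status 1 that fails on two consecutive days keeps the first day's timestamp, and a record with status 2 is untouched by a failed query.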

Perhaps it's good to also keep data on a "latest time" when a homepage was queried, so 2 fields:

hansioan commented 6 years ago

A note on URL redirects: for example, if the tool has the homepage http://phylobench.vital-it.ch/raxml-bb/ it actually redirects to https://embnet.vital-it.ch/raxml-bb/ , with the HTTP status 301 Moved Permanently. The second URL is fine in this case. On the other hand, the tool https://bio.tools/muplex has the homepage http://apps.diatomsoftware.com/muplex/html/MuPlex.html , which redirects to https://www.hugedomains.com/domain_profile.cfm?d=diatomsoftware&e=com with a status code of 302 Found. This means that, if only 200 OK counts as up, "correct" redirects to good homepages will be seen as down. If we do decide to accept redirects, we could instead end up accepting bad redirects.
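One possible way to tell the two cases apart, offered only as a rough heuristic and not something decided in this thread, is to compare the site of the original URL with the site of the redirect target. The helper names below are hypothetical, and taking the last two hostname labels as the "site" is a naive assumption that breaks for suffixes such as `.co.uk`.

```python
from urllib.parse import urlsplit


def site_key(url: str) -> str:
    """Naive 'site' of a URL: the last two labels of its hostname."""
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def classify_redirect(original_url: str, final_url: str) -> str:
    """Label a redirect as staying on the same site or leaving it."""
    if site_key(original_url) == site_key(final_url):
        return "same-site"
    return "cross-site"


# The two redirects quoted above:
print(classify_redirect(
    "http://phylobench.vital-it.ch/raxml-bb/",
    "https://embnet.vital-it.ch/raxml-bb/",
))  # same-site
print(classify_redirect(
    "http://apps.diatomsoftware.com/muplex/html/MuPlex.html",
    "https://www.hugedomains.com/domain_profile.cfm?d=diatomsoftware&e=com",
))  # cross-site
```

A cross-site result would not prove the redirect is bad (tools do move between institutions), so it is probably better used to flag entries for manual review than to mark them as down.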

joncison commented 6 years ago

Recommend to:

hansioan commented 6 years ago

Also, an example of a retired tool is https://bio.tools/kindock , for which the homepage returns status 200 but just contains a message saying that the tool is obsolete.

redmitry commented 6 years ago

In OpenEBench we keep track of metrics changes, so one can see the historical metrics changes for a tool:

https://openebench.bsc.es/monitor/metrics/log/{ID}/{path}

where {ID} is our ID, derived from bio.tools as {id}/{type}/{authority}:

  id = bio.tools:toolID, where toolID is the bio.tools toolID
  type = short names from bio.tools toolType ("cmd", "web", "rest", "soap" ...)
  authority = the institution that owns the service; here we use the hostname from the bio.tools homepage
  path = JSON pointer to the property (for instance: project/website/operational)

Example: https://openebench.bsc.es/monitor/metrics/log/bio.tools:pmut/cmd/mmb.irbbarcelona.org/project/website/operational

Note that the current state (checked periodically) may be obtained directly: https://openebench.bsc.es/monitor/metrics/bio.tools:pmut/cmd/mmb.irbbarcelona.org/project/website/operational

Cheers,

Dmitry

[edited 23 Nov 17]
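As a small illustration of the URL scheme described in this comment, the following sketch assembles both URL forms from their parts. The function name `metrics_url` and its parameters are hypothetical; only the URL layout and the pmut example values are taken from the comment.

```python
BASE = "https://openebench.bsc.es/monitor/metrics"


def metrics_url(tool_id: str, tool_type: str, authority: str, path: str, log: bool = False) -> str:
    """Build a metrics URL; log=True gives the history, log=False the current state."""
    parts = [BASE]
    if log:
        parts.append("log")
    # {ID} = {id}/{type}/{authority}, with id = bio.tools:toolID
    parts.append(f"bio.tools:{tool_id}/{tool_type}/{authority}")
    parts.append(path.strip("/"))
    return "/".join(parts)


# Values from the pmut example in the comment above
print(metrics_url("pmut", "cmd", "mmb.irbbarcelona.org", "project/website/operational", log=True))
print(metrics_url("pmut", "cmd", "mmb.irbbarcelona.org", "project/website/operational"))
```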

redmitry commented 6 years ago

Hello, we are working on the benchmarking and just changed the URL:

https://openebench.bsc.es/monitor/metrics/bio.tools:pmut/cmd/mmb.irbbarcelona.org/project/website/operational
https://openebench.bsc.es/monitor/metrics/log/bio.tools:pmut/cmd/mmb.irbbarcelona.org/project/website/operational

This is supposed to be final.

Cheers,

Dmitry

joncison commented 6 years ago

Support for tool status in the backend is there, with the automatic labelling. What's required is to update the database automatically when a broken link is detected (in future by biotoolsLint, see bio-tools/biotoolsLint#9).

See:

#230

#88

hansioan commented 5 years ago

Perhaps a good solution is a combined approach: checking / updating homepage/link status once a day, and also checking / rendering homepage status at tool load time.

joncison commented 5 years ago

"link" meaning any URL, not just homepage URL

hansioan commented 5 years ago

Perhaps one can use https://updown.io/api