joncison closed this issue 1 month ago
I suggest a coordinated action with @Ahto123 and Tomas.
The approach we take on homepages is that we will periodically query homepages and annotate homepage status thus:
An automatic process should check the URL (for code 200 OK) and update the status as follows:
When changing a status after a query, record the timestamp at which the query was made. If the homepage status is 0 and a later query still returns a status of 0, retain the oldest timestamp (i.e. when the homepage was first seen as down).
Perhaps it's good to also keep data on a "latest time" when a homepage was queried, so 2 fields:
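The timestamp rule above could be sketched roughly as follows. This is only an illustration, not the bio.tools implementation: the field names `status_since` and `last_checked` and the function `record_query` are hypothetical, chosen here to stand for "when this status was first seen" and "latest time queried".

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class HomepageStatus:
    status: int              # e.g. 0 = down (other values are assumptions)
    status_since: datetime   # hypothetical field: when this status was first seen
    last_checked: datetime   # hypothetical field: when the homepage was last queried


def record_query(previous: Optional[HomepageStatus],
                 new_status: int,
                 queried_at: datetime) -> HomepageStatus:
    """Return the record to store after one query of the homepage."""
    if previous is not None and previous.status == new_status:
        # Status unchanged: keep the oldest timestamp, refresh only last_checked.
        return HomepageStatus(new_status, previous.status_since, queried_at)
    # First query, or the status changed: both timestamps are this query's time.
    return HomepageStatus(new_status, queried_at, queried_at)
```

With this shape, a homepage that is down on two consecutive queries keeps the time of the first failed query in `status_since`, while `last_checked` moves forward on every query.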
A note on this concerns URL redirects:
For example, the tool homepage http://phylobench.vital-it.ch/raxml-bb/ actually redirects to https://embnet.vital-it.ch/raxml-bb/, and the HTTP status is 301 Moved Permanently. The second URL is fine in this case.
On the other hand, the tool https://bio.tools/muplex has the homepage http://apps.diatomsoftware.com/muplex/html/MuPlex.html, which redirects to https://www.hugedomains.com/domain_profile.cfm?d=diatomsoftware&e=com with a status code of 302 Found.
This means that if only 200 counts as up, "correct" redirects to good homepages will be seen as down. If we do decide to accept redirects, we may end up accepting bad redirects.
Recommend to:
- treat 301 Moved Permanently exactly like 200 OK.
- treat 302 Found in the same way as other problematic / unresolvable (i.e. not 301 or 200) URLs.

Also, an example of a retired tool is https://bio.tools/kindock, for which the homepage status is 200, but the page just contains a message saying that the tool is obsolete.
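A minimal sketch of the recommendation above, assuming a numeric status where 1 means "up" and 0 means "down" (only the value 0 appears in this thread, so 1 is an assumption). The names `classify_status` and `fetch_status` are hypothetical. `http.client` is used because it does not follow redirects, so the code of the first response (301, 302, ...) is what gets classified.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit

# Codes treated as "homepage is up", per the recommendation: 200 and 301 only.
UP_CODES = {200, 301}


def classify_status(code: Optional[int]) -> int:
    """Return 1 (up, assumed value) for 200/301, otherwise 0 (down)."""
    return 1 if code in UP_CODES else 0


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status of the first response, or None if unreachable.

    Redirects are deliberately not followed.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn_cls = http.client.HTTPSConnection
    else:
        conn_cls = http.client.HTTPConnection
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn = conn_cls(parts.netloc, timeout=timeout)
        conn.request("HEAD", path)
        code = conn.getresponse().status
        conn.close()
        return code
    except (OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, malformed response, ...
        return None
```

For instance, `classify_status(fetch_status(homepage))` would yield the value to store for a homepage. Note that this cannot catch cases like the kindock one above, where the page returns 200 but only says the tool is obsolete.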
In OpenEBench we keep track of metrics changes, so one can see the historical metrics changes for a tool:
https://openebench.bsc.es/monitor/metrics/log/{ID}/{path}
where {ID} is our ID, built from bio.tools data as {id}/{type}/{authority}:
id = bio.tools:toolID (the bio.tools toolID)
type = short name from the bio.tools toolType ("cmd", "web", "rest", "soap" ...)
authority = the institution that owns the service; here we use the hostname from the bio.tools homepage
path = JSON pointer to the property (for instance: project/website/operational)
Example: https://openebench.bsc.es/monitor/metrics/log/bio.tools:pmut/cmd/mmb.irbbarcelona.org/project/website/operational
Note that the current state (checked periodically) may be obtained directly: https://openebench.bsc.es/monitor/metrics/bio.tools:pmut/cmd/mmb.irbbarcelona.org/project/website/operational
Cheers,
Dmitry
[edited 23 Nov 17]
Hello, we are working on the benchmarking and just changed the URLs:
https://openebench.bsc.es/monitor/metrics/bio.tools:pmut/cmd/mmb.irbbarcelona.org/project/website/operational
https://openebench.bsc.es/monitor/metrics/log/bio.tools:pmut/cmd/mmb.irbbarcelona.org/project/website/operational
This is supposed to be final.
Cheers,
Dmitry
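For anyone wanting to build these OpenEBench URLs from bio.tools data, the scheme described above ({id}/{type}/{authority} followed by the property path) could be assembled roughly like this. This is a sketch based only on the URLs quoted in this thread. The function names are hypothetical, and the homepage passed in the usage example is a made-up value that merely has the right hostname.

```python
from urllib.parse import urlsplit

BASE = "https://openebench.bsc.es/monitor/metrics"


def openebench_id(tool_id: str, tool_type: str, homepage: str) -> str:
    """Build {id}/{type}/{authority}, taking the authority from the homepage hostname."""
    authority = urlsplit(homepage).hostname
    if not authority:
        raise ValueError(f"cannot extract a hostname from {homepage!r}")
    return f"bio.tools:{tool_id}/{tool_type}/{authority}"


def metrics_url(tool_id: str, tool_type: str, homepage: str,
                path: str, log: bool = False) -> str:
    """Return the current-state URL, or the change-log URL when log=True."""
    prefix = f"{BASE}/log" if log else BASE
    return f"{prefix}/{openebench_id(tool_id, tool_type, homepage)}/{path.lstrip('/')}"


# Usage (the homepage below is a hypothetical value for illustration):
url = metrics_url("pmut", "cmd", "http://mmb.irbbarcelona.org/PMut",
                  "project/website/operational")
```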
Support for tool status in the backend is there, with the automatic labelling. What's required is to update the database automatically when a broken link is detected (in future by biotoolsLint, see bio-tools/biotoolsLint#9).
See:
Perhaps a good solution is a combined approach: checking / updating homepage / link status once a day, and also checking / rendering homepages at tool load time.
"link" meaning any URL, not just homepage URL
Perhaps one can use https://updown.io/api
As agreed, need to: