WCGA / West-Coast-Ocean-Data-Portal

bugs and fixes for the geoportal back end and UI front end of the WCODP

Need for validator for links extracted from metadata records #45

Open emiliom opened 9 years ago

emiliom commented 9 years ago

Links to resources (web services, documents, files, etc.) pointed to in the harvested metadata documents will go stale. That's what links do :smile:. Heck, it's probably not uncommon for some links within a metadata document to be broken already when the metadata is harvested!

It would be good to have a link validation tool that can periodically provide a listing or assessment of currently broken links. Of course, fixing those links will require manual follow-up, but being able to identify problems is the first step. For example, if ALL links in a metadata record are broken, the metadata is almost useless (except for enterprising and persistent souls who will take the information present in the metadata, Google for it, then follow up directly with the originators; or email the Contact person, if a valid email is present). The same applies to metadata records that never had a link to begin with, which is the case for a number of old records in FGDC format.
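A minimal sketch of what such a periodic check could look like, assuming the links have already been extracted from each record as a plain list of HTTP(S) URLs. The names `check_link` and `assess_record` and the four category labels are hypothetical, not part of the Geoportal; the sketch uses only the Python standard library.

```python
from http.client import HTTPException
from urllib.request import Request, urlopen


def check_link(url, timeout=10):
    """Return True if an HTTP(S) URL answers a HEAD request without an error status."""
    try:
        # Assumption: a HEAD request is enough. Some servers reject HEAD,
        # so a real tool might retry with GET before declaring a link broken.
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError, HTTPException):
        # OSError covers HTTPError, URLError and timeouts;
        # ValueError covers strings that are not valid URLs.
        return False


def assess_record(links, checker=check_link):
    """Classify one metadata record by the state of its links."""
    if not links:
        return "no_links"      # e.g. old FGDC records with no link at all
    results = [checker(url) for url in links]
    if not any(results):
        return "all_broken"    # the "almost useless" case
    if all(results):
        return "ok"
    return "some_broken"
```

Passing the checker in as a parameter keeps the classification logic testable without network access, and would let an administrator swap in a service-aware check later.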

As an example, here's a query that returns two records with broken links. Both have broken Zip links as their only link types ("Metadata XML" and "JSON" don't count, in my book; they're internal references to the WCODP GeoPortal).

tchaddad commented 9 years ago

Interesting enhancement - I wonder if this would be for administrator use (in the Geoportal backend), or a sidecar to the UI somehow.

There are service status tools available, either for hosts to run directly, or for remote monitoring of sources e.g. http://registry.fgdc.gov/statuschecker/

But the above is directed at services, rather than plain old web 404-type errors.

emiliom commented 9 years ago

I would say both administrators and, ideally, a broader group of WCGA contributors/volunteers. It could also be for providers, to help them manage their own resources. But I realize that expecting a broad, dispersed group to stay on top of this is a social mechanism that's hard to pull off ...

tchaddad commented 9 years ago

A subtle way to handle this might be to go back to the idea of a metadata "detail view" page that could display the content of the XML in human-friendly form. It might not be too difficult to build into this view a set of icons, displayed next to each URL, that indicate whether the URL is in an up, down, or undetermined state (at the time of page viewing).
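One possible way to map a check result onto those three states, as a rough sketch only. The `LinkState` enum, the `classify` function, the icon names, and the choice of which status codes count as "down" are all assumptions made for illustration, not an existing Geoportal feature.

```python
from enum import Enum


class LinkState(Enum):
    UP = "up"
    DOWN = "down"
    UNDETERMINED = "undetermined"


# Hypothetical icon names; the real UI would supply its own assets.
ICONS = {
    LinkState.UP: "icon-link-up",
    LinkState.DOWN: "icon-link-down",
    LinkState.UNDETERMINED: "icon-link-unknown",
}


def classify(status_code):
    """Map an HTTP status code (None if no response arrived) to a LinkState."""
    if status_code is None:
        # Timeout or connection failure: the resource may only be
        # temporarily unreachable, so do not call it broken.
        return LinkState.UNDETERMINED
    if 200 <= status_code < 400:
        return LinkState.UP
    if status_code in (404, 410):
        # Assumption: only "not found" and "gone" are treated as clearly broken.
        return LinkState.DOWN
    # 401/403, 5xx and anything else say little about whether the link is stale.
    return LinkState.UNDETERMINED
```

Keeping "undetermined" as the default for ambiguous responses avoids flagging a link as dead because of a login wall or a temporary server error at the time of page viewing.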

emiliom commented 9 years ago

Sounds like a good prospect, especially in that it addresses that long-standing "need" for a more human-digestible view of the raw, complete metadata.