Closed jacobthill closed 3 years ago
An option for this issue may be to expand the validation around web resources so that it checks that the URL resolves, not only that it is valid: https://github.com/sul-dlss/dlme-transform/blob/main/lib/contracts/cho.rb#L125
This is captured in the Google doc, but maybe first we should check whether the urls are valid. I'm pretty sure the application doesn't break when a url doesn't resolve, as long as it's valid. It seems to break only when you pass something that doesn't look like a valid url. The other issue, of course, is that when we re-harvest a set we should test the urls and suppress those that no longer resolve, so we keep the site clean of broken urls.
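To make the valid vs. resolvable distinction concrete, here is a minimal Ruby sketch using only the standard library. The method names `valid_url?` and `resolves?` are made up for illustration and are not taken from dlme-transform or its contract; the timeout default is also an assumption.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: syntactic check only.
# True when the value parses as an absolute http(s) URL with a host.
def valid_url?(value)
  uri = URI.parse(value.to_s)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: network check.
# True when a HEAD request returns a 2xx or 3xx response.
# Any connection error, DNS failure, or timeout counts as "does not resolve".
def resolves?(value, timeout: 5)
  return false unless valid_url?(value)

  uri = URI.parse(value)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: timeout,
                             read_timeout: timeout) do |http|
    http.head(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess) || response.is_a?(Net::HTTPRedirection)
rescue StandardError
  false
end
```

Keeping the two checks separate would let the cheap syntactic check run on every transform, while the slower network check could be reserved for re-harvests. One caveat: some servers reject HEAD requests, so a real check might need to fall back to GET before treating a url as broken.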
@jacobthill I ran a test on the QNL data where we do not add the agg_preview if there isn't a manifest. This has resolved the breaking issue after re-transforming/indexing.
However - as stated elsewhere I think this is only a half measure. For discussion, what do we want to include in agg_preview (if anything) when there is no thumbnail URL or it is invalid?
Currently if a bad url is loaded in the agg_preview field (probably agg_is_shown_at as well), the following error message is displayed when clicking on the set of records containing the bad url (e.g. through selecting the data contributor):
Selecting `exhibit dashboard` to unload the records results in the same message.

Loading bad urls is part of the transformation process. I have error checking locally, but it only tests whether the url is valid, not whether it is resolvable. I also forget to run that check sometimes. In some cases, particularly when the collection is large, it is difficult to find the record with the bad url.
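For finding the offending record in a large collection, a scan along these lines might help. This is only a sketch under assumptions: it assumes the transform output is newline-delimited JSON, that each record has `id` and `dlme_source_file` keys, and that the web resource fields hold either a plain string or a hash with the url under `wr_id`. The helper names and sample records are hypothetical, and the check is syntactic only (it does not test whether the url resolves).

```ruby
require 'json'
require 'uri'

# Fields named in this issue as likely to carry bad urls.
URL_FIELDS = %w[agg_preview agg_is_shown_at].freeze

# Hypothetical helper: true for an absolute http(s) URL with a host.
def http_url?(value)
  uri = URI.parse(value.to_s)
  uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: returns one hash per bad url, carrying the record id
# and source file so the record can be located.
def bad_url_report(ndjson)
  ndjson.each_line.flat_map do |line|
    next [] if line.strip.empty?

    record = JSON.parse(line)
    URL_FIELDS.filter_map do |field|
      next unless record.key?(field)

      resource = record[field]
      # Assumed shape: either a bare string or { "wr_id" => url }.
      url = resource.is_a?(Hash) ? resource['wr_id'] : resource
      next if http_url?(url)

      { 'id' => record['id'],
        'dlme_source_file' => record['dlme_source_file'],
        'field' => field,
        'url' => url }
    end
  end
end

# Usage with made-up records:
sample = <<~NDJSON
  {"id":"rec-1","dlme_source_file":"a.xml","agg_preview":{"wr_id":"https://example.org/a.jpg"}}
  {"id":"rec-2","dlme_source_file":"b.xml","agg_preview":{"wr_id":"not a url"}}
NDJSON

bad_url_report(sample).each do |row|
  puts "#{row['id']} (#{row['dlme_source_file']}): #{row['field']} => #{row['url'].inspect}"
end
```

Running something like this as part of the transform, rather than as a separate local step, would remove the need to remember to run the check.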
Desired behavior:
- `exhibit dashboard` to unload the records.
- `dlme_source_file` value (if available) and the id instead of "We're sorry, but something went wrong."