wellcomecollection / platform

Wellcome Collection Digital Platform
https://developers.wellcomecollection.org/
MIT License

Look for manifests we point to on /works that return a 500 error #4963

Closed alexwlchan closed 3 years ago

alexwlchan commented 3 years ago

A member of staff spotted a broken item on /works yesterday; this was caused by a IIIF manifest that was returning a 500 error in DLCS. Slack discussion here: https://wellcome.slack.com/archives/C8X9YKM5X/p1610365766200100

We should check all the IIIF manifests we're serving in the catalogue API, and look for anything that's similarly broken.

alexwlchan commented 3 years ago

Here's a Python script that finds all the IIIF manifests in a works snapshot:

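#!/usr/bin/env python
"""Print the URL of every IIIF presentation manifest in a works snapshot."""

import gzip
import json


def locations():
    # The snapshot is read line by line, one JSON-encoded work per line
    for line in gzip.open("works.json.gz"):
        work = json.loads(line)
        for item in work["items"]:
            yield from item["locations"]


for loc in locations():
    # Physical locations are not IIIF manifests, so skip them
    if loc["type"] == "PhysicalLocation":
        continue
    assert loc["type"] == "DigitalLocation", loc
    # Skip IIIF image locations; we only want presentation manifests
    if loc["locationType"]["id"] == "iiif-image":
        continue
    assert loc["locationType"]["id"] == "iiif-presentation", loc
    print(loc["url"])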

And once you have that list in a text file, you can check it with the following shell script:

cat iiif_manifest_location_urls.txt \
  | xargs -I "{}" -P 10 curl -s -o /dev/null -w "{} %{http_code}\n" "{}" > iiif_checked.txt

I'm running this and will post results when it's done.
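The curl command above writes one "<url> <http_code>" line per manifest, so the output can be tallied by status code. Here is a minimal sketch of that step in Python; the function names `group_by_status` and `summarise` are made up for illustration, and it assumes the output file has exactly the format produced by the `-w "{} %{http_code}\n"` option above.

```python
from collections import defaultdict


def group_by_status(lines):
    """Group URLs by the HTTP status code curl reported for them."""
    by_status = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Each line looks like "<url> <http_code>"; split on the last space
        # in case the URL itself contains spaces.
        url, status = line.rsplit(" ", 1)
        by_status[status].append(url)
    return dict(by_status)


def summarise(path="iiif_checked.txt"):
    """Print a count per status code and return the grouped URLs."""
    with open(path) as f:
        groups = group_by_status(f)
    for status, urls in sorted(groups.items()):
        print(f"{status}: {len(urls)} URLs")
    return groups
```

Calling `summarise()["500"]` would then give the list of manifests returning a 500 error, if there are any. Note that curl reports `000` as the status code when it gets no HTTP response at all, so that group is worth checking too.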

alexwlchan commented 3 years ago

Here's a spreadsheet with all the affected IIIF manifests, grouped by error message:

iiif_errors.xlsx

tomcrane commented 3 years ago

FYI Everything beyond line 34 is an error thrown by enforcing a business rule:

tomcrane commented 3 years ago

The "Sequence contains more than one matching element" errors are cases where the METS XML should contain one and only one element but has more than one (an exception is thrown rather than just using the first). And the first 6 are self-explanatory: the item can't be found in storage.
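To illustrate the "one and only one" rule, here is a rough sketch in Python. It is not the actual implementation; the `single` helper and the example element names are hypothetical, and it only shows the behaviour of failing loudly instead of silently taking the first match.

```python
def single(items, predicate):
    """Return the only item matching predicate, or raise if there isn't exactly one."""
    matches = [item for item in items if predicate(item)]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one matching element, found {len(matches)}"
        )
    return matches[0]


# Hypothetical element names, purely for illustration
elements = ["title", "licence", "licence"]

try:
    single(elements, lambda e: e == "licence")
except ValueError as err:
    print(f"Rejected: {err}")
```

Taking the first match would hide the ambiguity in the metadata; raising makes the bad record visible so it can be fixed at the source.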

alexwlchan commented 3 years ago

> FYI Everything beyond line 34 is an error thrown by enforcing a business rule:

Yeah, I think DLCS is doing the correct thing here – the issue is somewhere in the underlying metadata or the catalogue data, because we're presenting these manifests as items that should be visible on /works. And they're not, so either we should change the data to make them visible, or stop telling people this is possible.

jamesgorrie commented 3 years ago

I'd like to have some better real-time metrics on this. We report when IIIF images are down. I'll have a look at monitoring other services in real time.

aray-wellcome commented 3 years ago

I'll fix the items that do not have a license and will look at any other outstanding bnumbers that are not closed but are still giving a 500 error from lines 1-34.