harvard-lil / perma

Indelible links

Don't assume provenance summary is present. #3419

Closed: rebeccacremona closed this 8 months ago

rebeccacremona commented 8 months ago

We are observing that for some small percentage of Scoop captures, attachments (including the provenance summary) are absent.

This PR makes sure that's not counted as a failure: if we have a decent warc, we shouldn't be too upset if we don't have all the metadata we wanted.
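For readers who don't want to open the diff, here is a minimal sketch of the idea, not the actual change in this PR. `ScoopResult`, `finish_job`, the `provenance_summary` key, and the `completed` status string are all hypothetical names chosen for illustration; the only assumption carried over from the description is that a usable WARC is what decides success.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScoopResult:
    """Hypothetical stand-in for what a capture job gets back from Scoop."""
    warc_bytes: Optional[bytes]                       # the archive itself
    attachments: dict = field(default_factory=dict)   # may or may not hold "provenance_summary"


def finish_job(result: ScoopResult) -> dict:
    """Decide the job outcome; only a missing WARC counts as a failure."""
    if not result.warc_bytes:
        return {"status": "failed", "provenance_summary": None, "missing_metadata": False}

    # .get() instead of indexing, so an absent attachment does not raise KeyError.
    summary = result.attachments.get("provenance_summary")
    return {
        "status": "completed",                 # assumed name for the non-failed state
        "provenance_summary": summary,
        "missing_metadata": summary is None,   # a flag a dashboard panel could count
    }
```

The point of the sketch is the `.get()` lookup: the metadata is recorded when present and flagged when absent, but its absence never turns a good capture into a failed job.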

I made a new Grafana panel so we can watch these go by.

Once this is up on prod, I'll clean up the records from over the weekend.

rebeccacremona commented 8 months ago

Here's how I plan to clean up:

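from perma.models import Capture

# Primary captures marked successful whose Scoop capture job was marked failed.
captures = Capture.objects.filter(
    role='primary',
    status='success',
    link__capture_job__engine='scoop-api',
    link__capture_job__status='failed',
)

# Sanity check before changing anything: expect roughly 90.
captures.count()

# Tag each affected link so these records stay easy to find later.
for capture in captures:
    capture.link.tags.add('scoop-exception-while-finishing-job')

# Run this last: it changes the status the filter above matches on.
captures.update(status='failed')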

This will correct ~90 records.