pulibrary / figgy

Valkyrie-based digital repository backend.

Spike on making PDF derivative generation more resilient to disk failure or slowness #6436

Closed tpendragon closed 6 days ago

tpendragon commented 1 week ago

We've noticed disk problems on our system mounts, and we're seeing a lot of problems with generating PDF derivatives, as in #6366.

This is a spike to see if we can improve things without completely overhauling how we do PDF derivatives.

Acceptance Criteria

First Step

Find a way to reproduce this scenario? Or fake it in a test.

Implementation Tips

Maybe check whether each page's file is zero bytes, and if so raise our own error so the job will retry.
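A rough sketch of that check, not taken from the figgy codebase. The error class `EmptyPageImageError` and the helper `ensure_page_written!` are hypothetical names. How the retry actually gets triggered depends on the job setup (for example, ActiveJob's `retry_on` could be pointed at the custom error).

```ruby
require "tempfile"

# Hypothetical error class, used only for illustration.
class EmptyPageImageError < StandardError; end

# Raises if the file at `path` exists but is zero bytes, so the calling job
# fails loudly (and can be retried) instead of building a PDF from a broken
# page image. Note: File.zero? returns false for a missing file, so a missing
# file is NOT caught by this check.
def ensure_page_written!(path)
  raise EmptyPageImageError, "#{path} is zero bytes" if File.zero?(path)
  path
end

# Demo with a freshly created (and therefore empty) temp file.
Tempfile.create("page") do |file|
  begin
    ensure_page_written!(file.path)
  rescue EmptyPageImageError => e
    puts "would retry: #{e.message}"
  end
end
```

Raising a dedicated error rather than a generic one makes it possible to retry only this failure mode and to spot it easily in error tracking.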

Sudden Priority Justification

PDFs are breakin' all over the place.

tpendragon commented 6 days ago

#6446 seemed to help a lot. One remaining issue: it looks like VIPS sometimes doesn't create the file where we told it to, or maybe doesn't create it at all. See https://app.honeybadger.io/projects/53391/faults/108852782. Retrying on ENOENT might be a good idea.
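One possible shape for the ENOENT retry, as a sketch only. `with_enoent_retry` and its `attempts` and `wait` parameters are hypothetical and not part of figgy or ruby-vips. The block passed in stands for whatever step writes and then reads the page file.

```ruby
# Hypothetical helper: runs the block and retries a limited number of times
# when it raises Errno::ENOENT (expected file missing). After the final
# attempt the error is re-raised so the failure is still visible.
def with_enoent_retry(attempts: 3, wait: 0.5)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue Errno::ENOENT
    raise if tries >= attempts
    sleep(wait) # give a slow mount a moment before trying again
    retry
  end
end

# Demo: a block that fails twice with ENOENT, then succeeds.
calls = 0
result = with_enoent_retry(attempts: 3, wait: 0) do
  calls += 1
  raise Errno::ENOENT, "page.tif" if calls < 3
  :generated
end
puts "succeeded after #{calls} attempts: #{result}"
```

Bounding the attempts keeps a genuinely missing file from looping forever. Whether to retry inline like this or let the whole job retry is a judgment call.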

tpendragon commented 6 days ago

I regenerated ~12 broken PDFs on https://figgy.princeton.edu/catalog/3931aa55-afd6-4c29-88bc-afaac7251f38 and they all came through green.