internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

Added covers appear broken #9836

Open seabelis opened 1 month ago

seabelis commented 1 month ago

Problem

I've noticed some covers appear broken even though a valid cover has been uploaded (example: https://openlibrary.org/books/OL51711917M/The_best_ghost_stories). This does not happen consistently, so I cannot provide steps to reproduce. This cover was added in May 2024 but now appears broken, and the edit cover modal indicates that no cover was uploaded. Where did it go? There's no history of the cover being removed.

[Screenshot 2024-09-02 at 08 56 55] [Screenshot 2024-09-02 at 08 57 15]


seabelis commented 1 month ago

Here's another example: https://openlibrary.org/works/OL3360282W/Classic_American_Short_Stories?m=history

[Screenshot 2024-09-02 at 09 02 50] [Screenshot 2024-09-02 at 09 03 00]

scottbarnes commented 1 month ago

Some additional information: for the first book, the cover, 14627720-L.jpg, seems to be missing from https://ia600505.us.archive.org/view_archive.php?archive=/35/items/l_covers_0014/l_covers_0014_62.zip.
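For anyone who wants to check other covers the same way, here is a minimal Python sketch. It is not taken from the Open Library codebase: the helper names `cover_archive_location` and `zip_has_member` are hypothetical, and the ID-to-archive mapping is only inferred from the single example in this thread (cover 14627720, size L, expected in `l_covers_0014/l_covers_0014_62.zip`), so treat it as an assumption to verify.

```python
import io
import zipfile


def cover_archive_location(cover_id: int, size: str = "") -> tuple[str, str]:
    """Guess the archive.org item and zip name that should hold a cover.

    Assumption: the layout is inferred from one example in this thread
    (14627720, size "L" -> l_covers_0014 / l_covers_0014_62.zip).
    """
    padded = f"{cover_id:010d}"                    # e.g. "0014627720"
    prefix = f"{size.lower()}_" if size else ""    # "l_" for size "L", "" for originals
    item = f"{prefix}covers_{padded[:4]}"          # e.g. "l_covers_0014"
    return item, f"{item}_{padded[4:6]}.zip"       # e.g. "l_covers_0014_62.zip"


def zip_has_member(zip_bytes: bytes, member: str) -> bool:
    """Return True if a file named `member` is listed in the zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return member in zf.namelist()
```

With a locally downloaded copy of the zip, `zip_has_member(data, "14627720-L.jpg")` would report whether the file is listed; the member name here follows the one quoted above and may differ from what the archive actually stores.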

mekarpeles commented 2 weeks ago

It looks like the error is specific to https://ia600505.us.archive.org/view_archive.php?archive=/23/items/covers_0014/covers_0014_62.zip

That is, covers_0014 (all sizes) was uploaded while that batch in particular was incomplete. We're investigating whether this came from the cover archive pipeline or from a manual upload. The other batches seem unaffected, but these 5k or so covers are likely gone, cleaned up by the finalize step after it noticed the batches had been uploaded to archive.org.

While investigating...

Possible interventions are: