Background

The frontend calls the OpenLibrary API directly to fetch book information (title, description, ISBN, cover image, etc.) and is fully responsible for powering search, where users can look up books by title or ISBN.

Our backend does not talk to OpenLibrary; everything is handled by the frontend, which then sends the information to the backend to save books in the user's library. The backend only saves the most essential information, so we do not, for example, save cover images.

The problem

The OpenLibrary API is slow, which keeps us from fetching information quickly enough to avoid hurting the user experience. Searching is exceptionally slow, and even fetching a cover image can take long enough to time out, leaving us with no image at all. This is okay for now: it works, and as it is a free service, the performance is nothing to complain about.

Proposed solution

OpenLibrary is open data and provides data dumps that can be downloaded and handled locally. There has been previous work on importing the data dumps into a database that is more easily searchable. Doing something similar, as well as serving all the cover images ourselves, should give us more control and shorten the distance to the data. We need to support free-text search on titles as well as search by ISBN (we have thus far only focused on ISBN-13), and to retrieve cover images in different sizes by ISBN.
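To make the two search requirements concrete, here is a minimal sketch of a local index, assuming SQLite with the FTS5 extension (which most Python builds include). The table names `books` and `book_titles` and all function names are hypothetical, chosen only for illustration; the actual database choice is still open.

```python
import sqlite3


def create_index(conn):
    # Plain table for exact ISBN-13 lookups (hypothetical schema).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "isbn13 TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    # FTS5 virtual table for free-text title search; isbn13 is stored
    # alongside the title but not tokenized.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS book_titles "
        "USING fts5(title, isbn13 UNINDEXED)"
    )


def normalize_isbn(isbn):
    # Accept hyphenated or spaced input such as "978-0-14-143951-8".
    return isbn.replace("-", "").replace(" ", "")


def add_book(conn, isbn13, title):
    isbn13 = normalize_isbn(isbn13)
    conn.execute(
        "INSERT OR REPLACE INTO books (isbn13, title) VALUES (?, ?)",
        (isbn13, title),
    )
    # Remove any older full-text row so re-imports do not create duplicates.
    conn.execute("DELETE FROM book_titles WHERE isbn13 = ?", (isbn13,))
    conn.execute(
        "INSERT INTO book_titles (title, isbn13) VALUES (?, ?)",
        (title, isbn13),
    )


def search_titles(conn, query, limit=10):
    # Quote every token so user input is never parsed as FTS5 query syntax.
    tokens = ['"' + t.replace('"', '""') + '"' for t in query.split()]
    if not tokens:
        return []
    rows = conn.execute(
        "SELECT isbn13, title FROM book_titles "
        "WHERE book_titles MATCH ? ORDER BY rank LIMIT ?",
        (" ".join(tokens), limit),
    )
    return rows.fetchall()


def find_by_isbn(conn, isbn):
    row = conn.execute(
        "SELECT isbn13, title FROM books WHERE isbn13 = ?",
        (normalize_isbn(isbn),),
    )
    return row.fetchone()
```

Tokens in a query are combined with an implicit AND, so `search_titles(conn, "fellowship ring")` only matches titles containing both words. Cover images are left out of this sketch; they would most likely be served as files keyed by ISBN and size rather than stored in the index.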

Some disadvantages of this solution are that OpenLibrary has millions of books in their database and the dumps are around 40 GB in size (excluding all cover images). We will have to develop a pipeline for building the local database from the new dumps that are released regularly, and ingestion will probably take a long time.

More research on this topic needs to be done.
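As a starting point for the ingestion pipeline, the sketch below streams a dump line by line instead of loading it into memory. It assumes the tab-separated layout used by the OpenLibrary dumps (record type, key, revision, last modified, JSON) and that edition records carry `title` and `isbn_13` fields; both should be verified against a real dump. The function name `iter_editions` is hypothetical.

```python
import json


def iter_editions(lines):
    """Yield (isbn13, title) pairs from an iterable of dump lines."""
    for line in lines:
        # Assumed layout: type, key, revision, last_modified, JSON.
        parts = line.rstrip("\n").split("\t", 4)
        if len(parts) != 5 or parts[0] != "/type/edition":
            continue  # skip works, authors and malformed lines
        try:
            record = json.loads(parts[4])
        except json.JSONDecodeError:
            continue  # skip rows whose JSON column cannot be parsed
        title = record.get("title")
        if not title:
            continue
        for isbn in record.get("isbn_13", []):
            # Store ISBNs without hyphens so lookups are consistent.
            yield isbn.replace("-", ""), title
```

Because this is a generator, it can be fed directly from a compressed file, for example `gzip.open(path, "rt", encoding="utf-8")`, so the roughly 40 GB of data never has to be held in memory at once. Batching the resulting rows into database transactions would likely be needed to keep ingestion time reasonable.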

Mozzo1000 / booklogr

Improvements slow OpenLibrary API calls #8
