openzim / cms

ZIM file Publishing Platform
https://cms.openzim.org
GNU General Public License v3.0

Auto deleting books #71

Closed rgaudin closed 2 years ago

rgaudin commented 2 years ago

Books are created via a Zimfarm call.

At that moment, ZIMs are probably online or about to be. We add them to the Title (or create it if it's the first) and that's it.

Over time, we end up with many books for a Title.

We know that we currently keep at most 2 dated ZIM files per content on the server, so if we have 3 books for a Title in the DB, chances are one of them is not reachable.

We should write down two things IMO:

kelson42 commented 2 years ago

@rgaudin The CMS should delete the books, whatever the reason. We should not have any other logic about book deletion anywhere else IMO.

rgaudin commented 2 years ago

Agreed, but:

kelson42 commented 2 years ago
  • we're not there yet and we'll need to delete books if we want to play with the generated library. I guess we'll do what I suggested in the interim should we not implement deletion first.

In the meantime, we should probably continue to delete books like we do today. On the CMS side, I'm in any case in favour of having a periodic check of whether ZIM files are still online. If a file is in a library but not available anymore, then it should trigger an alarm; otherwise I guess it could be removed from the DB silently.

  • Doesn't answer the question of when a book is considered expired, i.e. when it should be deleted. Doesn't need to be decided now but it's an important decision for M2.

The current logic is appropriate IMO for the moment, I would keep it. Otherwise, we could maybe trigger a deletion in N days, each time a ZIM file leaves the library. Probably we will have to hardcode a few deprecation approaches and make them available to choose at the "title" level.

See also #7 for ondemand deletions.

rgaudin commented 2 years ago

OK 👍

rgaudin commented 2 years ago

Turning this into an actionable task for M2: we only want to remove the N previous versions of the book when we receive and record a new one.

That number would be a constant and we could start with 3.
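
A rough sketch of that task in Python, under stated assumptions. The thread leaves the exact meaning of N open; this sketch reads it as "retain at most N books per Title and drop older ones when a new book is recorded". The `Book` and `Title` classes, the `KEEP_BOOKS_PER_TITLE` constant and `record_book` are hypothetical names, not the CMS's actual models.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

# Hypothetical constant; the thread suggests starting with 3.
KEEP_BOOKS_PER_TITLE = 3


@dataclass
class Book:
    """Hypothetical book: an identifier plus the date of its ZIM file."""
    id: str
    period: date


@dataclass
class Title:
    """Hypothetical title holding its books, newest first."""
    ident: str
    books: List[Book] = field(default_factory=list)


def record_book(
    title: Title, book: Book, keep: int = KEEP_BOOKS_PER_TITLE
) -> List[Book]:
    """Attach a new book to the title and prune the oldest ones.

    Returns the books that were dropped so the caller can delete
    them from the DB (and from storage, if applicable).
    """
    title.books.append(book)
    # Newest first, so everything past `keep` is the oldest surplus.
    title.books.sort(key=lambda b: b.period, reverse=True)
    pruned = title.books[keep:]
    title.books = title.books[:keep]
    return pruned
```

Returning the pruned books instead of deleting them inside the function leaves the actual deletion (DB rows, files, notifications) to the caller.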