lemon24 / reader

A Python feed reader library.
https://reader.readthedocs.io
BSD 3-Clause "New" or "Revised" License

delete_entry for entries created by update_feeds method #301

Closed kei49 closed 1 year ago

kei49 commented 1 year ago

Thank you for maintaining this library. This is an issue for a feature request.

Desired

Support delete_entry for entries created by update_feeds method

Current behavior

When you try to delete an entry using delete_entry, the error below is raised if the entry was created by update_feeds. This restriction is documented in the API reference:

reader.exceptions.EntryError: entry must be added by 'user', got 'feed':

Background

I manage some feeds periodically using reader.update_feeds() to sync with the latest entries for each feed. Due to the database restriction (only SQLite is supported), the library has been crashing frequently, maybe because the storage grew too large. To solve this, I tried to delete old entries for each feed rather than deleting the feed storage, but doing that requires the feature requested here.

lemon24 commented 1 year ago

Hi, thank you for opening this!

tl;dr: It will be some time until we have reader.delete_entry(), because "properly" deleting entries is non-trivial. For now, you can use reader._storage.delete_entries(), with some caveats.

Not being able to delete entries is a known issue, tracked in #96 (probably one of the oldest open issues).

Deleting entries properly is non-trivial (more details in the Open questions part of https://github.com/lemon24/reader/issues/96#issuecomment-1236304134).

storage.delete_entries() works for all entries, but does not handle the cases above (hence the limitation in the high-level API). Depending on your use case, this may not be an issue, so feel free to use it (it's not part of the stable/documented API, but it is extremely unlikely to change in any way).
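To illustrate, a pruning sketch along these lines might look like the following. Note the selection helper `select_old_entry_ids` is entirely hypothetical, and the commented-out calls assume reader entries expose a `resource_id` (feed URL, entry id) pair and that the internal `_storage.delete_entries()` accepts an iterable of such pairs; verify both against your reader version before relying on them.

```python
from datetime import datetime, timedelta, timezone

def select_old_entry_ids(entries, max_age_days=90, now=None):
    """Pick (feed_url, entry_id) pairs for entries older than max_age_days.

    `entries` is any iterable of objects with a `.published` attribute
    (a timezone-aware datetime, or None) and a `.resource_id` pair.
    Entries without a published date are kept, to be safe.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        e.resource_id
        for e in entries
        if e.published is not None and e.published < cutoff
    ]

# Usage against the internal (unstable, hence the underscore) API:
#
#   old = select_old_entry_ids(reader.get_entries(feed=feed_url))
#   reader._storage.delete_entries(old)
```

Since deleted entries can reappear on the next update if the feed still contains them, running a prune like this right after update_feeds() only helps for entries the feed itself no longer serves.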

Unfortunately, I don't have any estimate for when I'll be able to work on #96, I don't have a lot of free time at the moment.

lemon24 commented 1 year ago

> the library has been crashing frequently, maybe because the storage grew too large

Can you please provide more details on this, so I can better understand your use case?


For reference, my database has 18k entries in 170 feeds.

On a t4g.nano AWS instance (2 vCPU, 0.5 GiB), the /?limit=64&show=all web app page renders in 80ms (it ends up calling get_entries(limit=64) underneath, but I don't have timings for the actual call).

On my 2013 laptop, with the same db:

In [6]: %time _ = list(reader.get_entries(limit=64))
CPU times: user 9.12 ms, sys: 999 µs, total: 10.1 ms
Wall time: 9.34 ms
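To reproduce a measurement like this outside IPython (where %time is not available), a minimal sketch using time.perf_counter, assuming a configured reader instance:

```python
import time

def time_call(fn, *args, **kwargs):
    """Return (result, elapsed seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Usage with a hypothetical reader instance:
#
#   entries, elapsed = time_call(lambda: list(reader.get_entries(limit=64)))
#   print(f"{len(entries)} entries in {elapsed * 1000:.2f} ms")
```

A single call is enough for a rough comparison like the one above; for stable numbers, timeit.repeat over several runs is the more careful tool.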