fossar / selfoss

multipurpose rss reader, live stream, mashup, aggregation web application
https://selfoss.aditu.de
GNU General Public License v3.0

question about "items_lifetime" #1480

Open herrxyz opened 2 months ago

herrxyz commented 2 months ago

Hey there, I don't think I really understand what "items_lifetime" does. According to the documentation: "Number of days since the item has been last seen after which it can be deleted. Set to 0 to disable item deletion. Starred items will never be deleted." I thought this meant that items/articles get deleted after this time, for example after 30 days. I expected that after setting it to 15 and pressing "update", a lot of old articles would be deleted and the number of articles would drop, but the number shown in the web interface didn't change. Please explain what exactly this parameter does and, if possible, how to delete old articles (selfoss is getting slow after several years). I am using https://hub.docker.com/r/rsprta/selfoss but migrated my years-old SQLite database into this image.

jtojnar commented 2 months ago

Hi. I feel your pain. My selfoss.db is 434 MB and selfoss is pretty sluggish for me.

The articles should indeed get deleted after the update finishes. Just note that the configuration value refers to the number of days since the article was last seen in the feed. So if the feed still contains the article, it will not be removed because it would likely just be re-added on the next update.
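To make the "last seen" rule concrete, here is a minimal Python sketch of the rule as described in this thread. It is not selfoss's actual code: `is_deletable` and its parameters are hypothetical names chosen for illustration, and the dates are made up.

```python
from datetime import datetime, timedelta


def is_deletable(last_seen, starred, items_lifetime, now):
    """Hypothetical helper mirroring the rule described in this thread."""
    if items_lifetime == 0:
        # 0 disables item deletion entirely.
        return False
    if starred:
        # Starred items are never deleted.
        return False
    # Age is measured from when the item was last seen in the feed,
    # not from when it was published.
    return now - last_seen > timedelta(days=items_lifetime)


now = datetime(2024, 6, 30)

# An old article the feed still lists was "last seen" at the latest update,
# so it is kept no matter how long ago it was published.
still_in_feed = is_deletable(datetime(2024, 6, 29), False, 15, now)

# An article that dropped out of the feed 40 days ago is eligible for deletion.
gone_from_feed = is_deletable(datetime(2024, 5, 21), False, 15, now)

print(still_in_feed, gone_from_feed)  # False True
```

This would explain why lowering the value may not shrink the item count much if most of your feeds keep long histories.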

I double checked by reducing the limit from 1000 to 30 on a copy of my database and it indeed deleted most of the items (and reduced the file size to 29 MB). You should be able to verify with the following command – it will print counts of items grouped by date they were last seen (as date|age|count):

$ sqlite3 data/sqlite/selfoss.db 'select date(lastseen), cast(julianday(date()) - julianday(date(lastseen)) as int), count(*) from items where starred = false group by date(lastseen)'
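If you would rather get a single number than a per-day breakdown, something along these lines should work. This is only a sketch: `count_expired` is a hypothetical helper, and the demo table is a simplified stand-in that contains just the `lastseen` and `starred` columns used in the command above, not the real selfoss schema.

```python
import sqlite3


def count_expired(conn, items_lifetime, today="now"):
    """Count unstarred items last seen more than items_lifetime days before `today`."""
    row = conn.execute(
        "SELECT count(*) FROM items "
        "WHERE starred = 0 "
        "AND julianday(date(?)) - julianday(date(lastseen)) > ?",
        (today, items_lifetime),
    ).fetchone()
    return row[0]


# Demo on an in-memory database with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, lastseen TEXT, starred INTEGER)"
)
conn.executemany(
    "INSERT INTO items (lastseen, starred) VALUES (?, ?)",
    [
        ("2024-06-29 08:00:00", 0),  # seen 1 day ago: kept
        ("2024-05-01 08:00:00", 0),  # seen 60 days ago: expired
        ("2024-04-01 08:00:00", 1),  # old but starred: kept
        ("2024-06-10 08:00:00", 0),  # seen 20 days ago: expired at 15, kept at 30
    ],
)

expired_15 = count_expired(conn, 15, today="2024-06-30")
expired_30 = count_expired(conn, 30, today="2024-06-30")
print(expired_15, expired_30)  # 2 1
```

To try it against real data, replace the in-memory connection with `sqlite3.connect("data/sqlite/selfoss.db")`, ideally pointing at a copy of the file, and leave `today` at its default.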

Unfortunately, I do not think there is currently much to do other than trying to remove some feeds.

Optimizing the database is on my to-do list for 2.20, but I now only have access to my main computer at weekends, so I am not sure when I will be able to tackle it.