feedreader / pluto

pluto gems - planet feed reader and (static) website generator - auto-build web pages from published web feeds
Creative Commons Zero v1.0 Universal

Delete items which are removed from a feed #16

Closed harry-wood closed 5 years ago

harry-wood commented 5 years ago

Trying to fix this spam problem on blog.openstreetmap.org (https://github.com/gravitystorm/blogs.osm.org/issues/17), I've come up with this logic, which I believe will solve it.

I've added a test. I have now also tested this logic end-to-end within a full pluto set-up, and it seems to work as expected. I've only done that by hacking the gem internals. I didn't figure out how to build this into a gem properly. The multi-gem set-up with hoe is confusing to me at the moment.

Also you may prefer to make this behaviour optional, as a thing users can switch on for a particular feed (so a config option for the ini file entries, rather like filters). I took a look at doing this, but for similar reasons passing the option through multiple gems seemed tricky.

Having said that, I could make a case that the "delete removed" behaviour is actually desirable as a reasonable default. For most blog feeds most of the time it will make no difference (it will find that nothing was removed), but presumably if a source blogger ever wants to delete an entry they published by accident, they'll run into the same problem we're seeing with deleting spam diary entries.
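For readers following along, the "delete removed" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this pull request: `StoredItem`, `FeedEntry` and `removed_items` are hypothetical names, and the sketch assumes that a feed lists only its most recent entries, so a stored item counts as removed only if it is at least as new as the oldest entry still in the feed and its guid is missing from the feed.

```ruby
require 'set'

# Hypothetical stand-ins for stored items and freshly fetched feed entries;
# these are not pluto's actual models.
StoredItem = Struct.new(:guid, :published)
FeedEntry  = Struct.new(:guid, :published)

# Returns the stored items that appear to have been removed at the source.
# Assumption: items older than the oldest entry still in the feed have just
# scrolled off the end of the feed, so they are left alone.
def removed_items(stored_items, feed_entries)
  return [] if feed_entries.empty? # an empty fetch tells us nothing

  live_guids = feed_entries.map(&:guid).to_set
  oldest     = feed_entries.map(&:published).min

  stored_items.select do |item|
    item.published >= oldest && !live_guids.include?(item.guid)
  end
end

stored = [
  StoredItem.new('old-post',  Time.utc(2019, 1, 1)),
  StoredItem.new('good-post', Time.utc(2019, 6, 1)),
  StoredItem.new('spam-post', Time.utc(2019, 6, 2)),
  StoredItem.new('new-post',  Time.utc(2019, 6, 3))
]
feed = [
  FeedEntry.new('good-post', Time.utc(2019, 6, 1)),
  FeedEntry.new('new-post',  Time.utc(2019, 6, 3))
]

removed_items(stored, feed).each { |item| puts "DELETE #{item.guid}" }
# prints "DELETE spam-post"
```

In this example `old-post` is kept because it is older than anything the feed still lists, which is why a feed where nothing was removed produces no deletions at all.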

geraldb commented 5 years ago

Sorry for the late reply. I was on vacation (offline) until today. You have, of course, my full support. Thanks for using pluto for the OpenStreetMap planet. As a quick review: I'm happy to merge, and you're welcome to join as a committer if interested (just let me know and I'll send you an invitation via GitHub). The only addition would be something like a feature / command line flag (e.g. --delete) that lets you toggle the (new) delete functionality, since not everybody might want to use it. I will try to add it in the next few days (and push out a new gem).

harry-wood commented 5 years ago

> feature / command line flag (e.g. --delete) that lets you toggle the (new) delete functionality

Yes, that could be a good idea, although as I say, I don't think it should be a problem running with this over any blog source. I think ideally it would be a per-feed option within the .ini file config.
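To make the per-feed idea concrete, here is a small sketch of how such a switch might be read once the configuration has been parsed into one hash per feed section. Everything in it is an assumption for illustration: the `delete_removed` key, the `FEEDS` hash, the example URLs and the `delete_removed?` helper are hypothetical, and no such option exists in pluto as far as this thread shows.

```ruby
# Hypothetical parsed configuration: one hash per feed section.
# The 'delete_removed' key is an assumption for illustration only;
# it is not an existing pluto setting.
FEEDS = {
  'osmdiaries' => { 'feed' => 'https://example.org/diary/rss', 'delete_removed' => 'true' },
  'someblog'   => { 'feed' => 'https://example.org/blog/feed' }
}

# The option is off unless a feed section explicitly switches it on.
def delete_removed?(feed_config)
  feed_config.fetch('delete_removed', 'false').to_s.downcase == 'true'
end

FEEDS.each do |name, config|
  puts "#{name}: delete removed items? #{delete_removed?(config)}"
end
```

Defaulting to off keeps existing set-ups unchanged; flipping the default to on would match the argument that the behaviour is harmless for ordinary blog feeds.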

geraldb commented 5 years ago

Thanks for your patience. Since you're basically the only user, as far as I can tell, I'm with you. Let's keep it simple - I will just add some comments (for some possible future work). I will try to push out a new gem later this week.

harry-wood commented 5 years ago

Well, to be clear, I'm one OpenStreetMap enthusiast interested in fixing the spam problem, but I'm sure the OpenStreetMap sysadmins (people like @tomhughes) would appreciate your vigilance in ensuring I don't commit any crazy code to pluto :-)

They may also feel more comfortable with the ability to configure this new logic in the .ini file, to apply only to the OpenStreetMap diaries (although I maintain that this is not strictly necessary, since the code won't cause any adverse effects for other sources).

In fact, I'm currently demoing this working here: https://harrywood.dev.openstreetmap.org/blogs.osm.org/build (compare with https://blogs.openstreetmap.org).

And in the logfile here: https://harrywood.dev.openstreetmap.org/blogs.osm.org/buildscript.log we can see it doing a "DELETE" only on spam entries in the osm diaries. It doesn't go crazy deleting anything it shouldn't.

What d'you think @tomhughes ? If @geraldb packages this up as a new pluto gem, would you be comfortable rolling with it? Or d'you think we should ask gerald to make it an .ini file option? (I did have a go at doing this, but couldn't work out how to do it across the different gems)

geraldb commented 5 years ago

No worries - if anything breaks we can change / fix / improve the code (and push out a new gem). Thanks again for the detailed report and testing.

geraldb commented 5 years ago

@harry-wood FYI: Your "delete removed feed items" change is now "gemified", that is, you can use the latest gem, pluto-models v1.5.1 (https://rubygems.org/gems/pluto-models), which includes your changes. Thanks again. Cheers.
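As a rough illustration of what "use the latest gem" means in version terms, the sketch below checks whether a given pluto-models version is at least v1.5.1, the release named above. The `MIN_VERSION` constant and the `delete_change_included?` helper are hypothetical names, and the Gemfile line in the comment is an assumption about how a site might pin the gem, not something taken from the blogs.osm.org repository.

```ruby
require 'rubygems'

# A Gemfile could express the same constraint with a line such as:
#   gem 'pluto-models', '>= 1.5.1'
# (an assumption for illustration; a site's real Gemfile may differ)
MIN_VERSION = Gem::Requirement.new('>= 1.5.1')

# True if the given pluto-models version is new enough to include the
# delete-removed-items change, based on the v1.5.1 release named above.
def delete_change_included?(version)
  MIN_VERSION.satisfied_by?(Gem::Version.new(version))
end

puts delete_change_included?('1.5.0') # prints "false"
puts delete_change_included?('1.5.1') # prints "true"
```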

harry-wood commented 5 years ago

> now "gemified"

Thanks!

I didn't realise, but @tomh has done the bundle update to put this change live on https://blogs.openstreetmap.org. In fact, he did that a few minutes after you said that. I think he was keen to zap that spam! :-)

geraldb commented 5 years ago

Thank you. Great to see the zap spam machinery live. Have a great weekend. Cheers.