miniflux / v2

Minimalist and opinionated feed reader
https://miniflux.app
Apache License 2.0

Rewrite rules for scraped article content #1206

Open Mihara opened 2 years ago

Mihara commented 2 years ago

Among the feeds I read, there's a site that puts its article content in a div.entry-content but supplements it with a large box of social links, identical for every article, and truncates its feed entries. When Miniflux is told to fetch the original article content, the SVG images in that box balloon out to occupy 100% width.

As far as I can tell from experimentation and a cursory study of the source, rewrite rules are not applied when fetching original article content, even though that is where they would be most helpful. Scraper rules are applied, but they are not enough to strip out the aforementioned social links.
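To make the request concrete, here is a minimal sketch of the two-stage behavior I'm after: first select the article container (what a scraper rule does), then strip an unwanted subtree from it (what a rewrite step would do). This is not Miniflux's implementation (Miniflux is written in Go). The `social-links` class name, the `ContentExtractor` class, and the `extract` function are hypothetical names chosen for illustration, and the sketch assumes reasonably well-formed HTML.

```python
from html import escape
from html.parser import HTMLParser

# HTML tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ContentExtractor(HTMLParser):
    """Collect the inner HTML of the first element carrying `keep_class`,
    omitting any nested subtree whose root carries `drop_class`."""

    def __init__(self, keep_class, drop_class):
        super().__init__()
        self.keep_class = keep_class
        self.drop_class = drop_class
        self.depth = 0          # 0 = outside the kept element
        self.skip_until = None  # depth to return to before output resumes
        self.done = False       # set once the kept element has been closed
        self.out = []

    @staticmethod
    def _classes(attrs):
        return (dict(attrs).get("class") or "").split()

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        void = tag in VOID_TAGS
        classes = self._classes(attrs)
        if self.depth == 0:
            # Stage 1 (selection): enter the kept element; its own tag is not emitted.
            if not void and self.keep_class in classes:
                self.depth = 1
            return
        if self.skip_until is None and self.drop_class in classes:
            if void:
                return  # a dropped void element has no subtree to skip
            # Stage 2 (removal): suppress output until this element closes.
            self.skip_until = self.depth
        if not void:
            self.depth += 1
        if self.skip_until is None:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <path/> never change depth.
        if (self.depth > 0 and not self.done and self.skip_until is None
                and self.drop_class not in self._classes(attrs)):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or self.depth == 0 or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True  # the kept element itself has closed
        elif self.skip_until is not None:
            if self.depth == self.skip_until:
                self.skip_until = None  # the dropped subtree has closed
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0 and not self.done and self.skip_until is None:
            self.out.append(escape(data, quote=False))


def extract(html_text, keep_class="entry-content", drop_class="social-links"):
    parser = ContentExtractor(keep_class, drop_class)
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out).strip()


page = ('<body><nav>menu</nav><div class="entry-content"><p>Hello <b>world</b></p>'
        '<div class="social-links"><a href="#"><svg width="100%"><path d="M0 0"/></svg></a></div>'
        '<p>Bye</p></div><footer>f</footer></body>')
print(extract(page))  # prints: <p>Hello <b>world</b></p><p>Bye</p>
```

A scraper rule alone only covers the first stage here; the second stage is the part that seems to be missing when original content is fetched.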

sa7mon commented 2 years ago

I just came across this issue today. From my attempts, I have found that rewrite rules actually are applied to articles when "fetch original content" is enabled, but you have to set the rules when you add the subscription. If you update the rewrite rules on an existing subscription, it seems to have no effect.

I really hope we can either get this fixed or get the behavior documented. The docs are lacking, especially on this point.