miniflux / v2

Minimalist and opinionated feed reader
https://miniflux.app
Apache License 2.0

Per-domain scrape rule #333

Open yegle opened 5 years ago

yegle commented 5 years ago

A Google News RSS feed contains links to many different sites, and each site needs its own scraper rule.

Example of such an RSS feed: https://news.google.com/news/rss/headlines/section/geo/SanFrancisco

yegle commented 5 years ago

I guess what I actually want is https://github.com/miniflux/miniflux/blob/master/reader/scraper/rules.go, but exposed as a flag or something similar, so that adding a rule doesn't require upstreaming the change first and waiting for the next release.

qjebbs commented 5 years ago

I guess we could have an environment variable, like SCRAPER_RULES="path/to/rules.json", to make the rules configurable by users.
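A minimal sketch of how such a variable could be read, for illustration only. SCRAPER_RULES is just the name proposed in this comment, not an existing Miniflux option, and the file format (a JSON object mapping a domain to a CSS selector) and the helper names parseRules and loadRulesFromEnv are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// parseRules decodes a JSON object that maps a domain to a CSS selector,
// for example {"example.org": "article.main"}.
// This file format is an assumption made for the sketch.
func parseRules(data []byte) (map[string]string, error) {
	rules := make(map[string]string)
	if err := json.Unmarshal(data, &rules); err != nil {
		return nil, err
	}
	return rules, nil
}

// loadRulesFromEnv reads the file named by the hypothetical SCRAPER_RULES
// environment variable. An unset variable yields an empty rule set.
func loadRulesFromEnv() (map[string]string, error) {
	path := os.Getenv("SCRAPER_RULES")
	if path == "" {
		return map[string]string{}, nil
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	return parseRules(data)
}

func main() {
	rules, err := loadRulesFromEnv()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot load scraper rules:", err)
		os.Exit(1)
	}
	fmt.Printf("loaded %d custom scraper rule(s)\n", len(rules))
}
```

Rules loaded this way could be consulted before the built-in ones, so a user-supplied entry would take effect without a new release.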

fguillot commented 5 years ago

You can still define scraper rules for each feed via the user interface (edit feed page).

yegle commented 5 years ago

Yes, you can define a scraper rule per feed, but not per domain. The RSS feed in the original post contains posts from many different domains.
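To illustrate the difference, here is a rough sketch of a per-domain lookup, where the rule is chosen from each entry's own URL rather than from the feed. The function name ruleForEntry, the example domains, and the selectors are made up for illustration; the rule set is assumed to be a map from domain to CSS selector.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ruleForEntry returns the rule registered for the domain of entryURL.
// The boolean is false when the URL cannot be parsed or no rule matches.
// A leading "www." is ignored (a simplification chosen for this sketch).
func ruleForEntry(rules map[string]string, entryURL string) (string, bool) {
	u, err := url.Parse(entryURL)
	if err != nil {
		return "", false
	}
	host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
	rule, ok := rules[host]
	return rule, ok
}

func main() {
	// Hypothetical per-domain rules: domain -> CSS selector.
	rules := map[string]string{
		"example.org": "article.main",
		"example.com": "div.story-body",
	}

	// Entries of one aggregated feed can point at different sites.
	links := []string{
		"https://www.example.org/news/1",
		"https://example.com/story/2",
		"https://example.net/post/3",
	}
	for _, link := range links {
		if rule, ok := ruleForEntry(rules, link); ok {
			fmt.Printf("%s -> %s\n", link, rule)
		} else {
			fmt.Printf("%s -> no rule\n", link)
		}
	}
}
```

With a per-feed rule, all three entries above would be scraped with the same selector, which is exactly what breaks for an aggregated feed.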

fguillot commented 5 years ago

Ok, I see.

somini commented 4 years ago

Not sure, but this might be better suited to something like RSS Bridge. This gets hairy fast if you try to correctly parse the entire Internet.

https://github.com/RSS-Bridge/rss-bridge