miniflux / v2

Minimalist and opinionated feed reader
https://miniflux.app
Apache License 2.0

[Feature Request] Allow filter rules based on specific tag #951

Open garritfra opened 3 years ago

garritfra commented 3 years ago

As a user, I want to be able to add a filter rule to a feed based on a specific XML tag. Currently, filtering is only possible on the title.

Concrete example

Consider a feed that has these entries:

<item>
    <title>My second post</title>
    <link>https://example.com/2</link>
    <description>Some content</description>
    <author>Foo Bar</author>
    <category>dev-blogs</category>
    <pubDate>Tue, 05 Jan 2021 01:50:00 GMT</pubDate>
</item>
<item>
    <title>My first post</title>
    <link>https://example.com/1</link>
    <description>Some more content</description>
    <author>Bar Baz</author>
    <category>news</category>
    <pubDate>Sun, 03 Jan 2021 11:00:00 GMT</pubDate>
</item>

I want to subscribe only to entries marked with the category "dev-blogs".

Possible solution

Allow tags to be specified in Keep/Block rules. If a rule names a tag (detected by regex or raw character parsing?), match it against the entire entry. If not, keep the current behavior and match only the title.


Resources

This feature could be implemented here: https://github.com/miniflux/v2/blob/de7a61309878e335fe99c7afbbe6a40cc097b0ae/reader/processor/processor.go#L86

Context

I stumbled upon this issue when I wanted to subscribe to the dev-blogs of a game. The authors only have a single RSS feed for all their content.

garritfra commented 3 years ago

Fiddling around with this, I found that it's not as easy as I expected. To query non-standard attributes (like category), the raw entry would need to be stored. Another way would be to add an "extra tags" field that is itself a key-value map.
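The "extra tags" idea could be sketched roughly as follows. This is only an illustration using Go's standard encoding/xml package: the Entry type, its Extra field, and parseItem are hypothetical names, not Miniflux's actual model or parser.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Entry is a hypothetical type for illustration, not Miniflux's model.
type Entry struct {
	Title string
	// Extra maps a tag name to every value found for it,
	// since tags such as <category> may repeat within one item.
	Extra map[string][]string
}

// parseItem decodes a single <item> element and keeps the text of
// every direct child element, so rules can later query any tag.
func parseItem(raw string) (Entry, error) {
	var item struct {
		Children []struct {
			XMLName xml.Name
			Value   string `xml:",chardata"`
		} `xml:",any"`
	}
	if err := xml.Unmarshal([]byte(raw), &item); err != nil {
		return Entry{}, err
	}

	e := Entry{Extra: make(map[string][]string)}
	for _, c := range item.Children {
		name := c.XMLName.Local
		value := strings.TrimSpace(c.Value)
		if name == "title" {
			e.Title = value
			continue
		}
		e.Extra[name] = append(e.Extra[name], value)
	}
	return e, nil
}

func main() {
	raw := `<item>
    <title>My second post</title>
    <link>https://example.com/2</link>
    <author>Foo Bar</author>
    <category>dev-blogs</category>
</item>`

	e, err := parseItem(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(e.Title, e.Extra["category"])
}
```

A map of string slices avoids storing the raw entry, at the cost of flattening nested elements and dropping attributes.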

Is there a simpler way to query every field in the original entry?

JKingweb commented 3 years ago

For what it's worth, I anticipated the usefulness of this for The Guardian's rather large news feed in my own (forthcoming) independent implementation of Miniflux's API. There, I simply had rules try to match against the title and, in the absence of a match, against each category name until a match is found. So, for the example above, a keep rule of ^dev-blogs$ would suffice.

The downside to this is that since Miniflux doesn't actually expose RSS/Atom categories, it might be confusing for users. I'd be keen to discuss the best, least confusing way to expose such a feature.

x0tester0x commented 1 year ago

Since the parser for feeds (RSS, Atom and JSON) is already merged and implemented, when and how can I use this in the block or keep rules?

x0tester0x commented 11 months ago

Please implement this and update the documentation.

privatmamtora commented 4 months ago

This was implemented in #2526. Only the documentation needs to be updated.

shanewstone commented 3 months ago

This was implemented in #2526. Only the documentation needs to be updated.

Please correct me if I am mistaken, but that PR only addresses global filter rules. The request here is for fields used in filter rules applied to individual feeds. The fields used to filter within specific tags (e.g., EntryTitle, EntryContent) cannot be used in a feed's keep or block rules, only in the global filter rules.

jacob-faber commented 2 months ago

@shanewstone, yes, that's correct. Individual feed filtering needs to be refactored to use the same logic (EntryTitle, EntryContent). For my personal use, I apply this patch to match against the content of individual feeds: https://github.com/miniflux/v2/compare/main...jacob-faber:miniflux-v2:filter-content