dewey / miniflux-sidekick

A sidekick for Miniflux, filter out items by regex or tags. Compatible with killfiles. 🔪
MIT License

The negation rule doesn't work as expected. #13

Open imjoeyli opened 1 year ago

imjoeyli commented 1 year ago

It seems that when a negation rule (!# or !~) is present alongside positive rules (# or =~), it doesn't work as expected.

The filter seems to ignore the "<feed>" that precedes the negation rule and should restrict it to one feed. As a result, the negation rule is applied to every feed, and entries that should not be filtered are marked as read.

By the way, I would like to know how to change the log level to debug. Right now, $ docker-compose logs app doesn't show the related logs.

imjoeyli commented 1 year ago

For testing purposes, my killfile rules are as follows.

ignore-article "https://sspai.com/feed" "title !~ (派评|一日一技|看什么|App\+1)"
ignore-article "https://www.macstories.net/feed/" "title =~ (MacStories\sUnwind|Episode\s\d|Club\sMacStories|Kolide)"
ignore-article * "title =~ \[Sponsor\]"

The result is that, apart from the posts in https://sspai.com/feed that survive the filter, all posts in the other feeds are marked as read.
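To illustrate why this is so destructive, here is a minimal Go sketch. It is not the sidekick's actual code: the helper matchesNegated and the sample titles are made up for illustration, and only the pattern is copied from the first killfile rule above. A "!~" rule matches every title that does not contain the pattern, so once it is applied to a feed other than sspai.com, almost every entry qualifies.

```go
package main

import (
	"fmt"
	"regexp"
)

// Pattern copied from the first killfile rule above.
var sspaiPattern = regexp.MustCompile(`(派评|一日一技|看什么|App\+1)`)

// matchesNegated is a hypothetical helper (not part of miniflux-sidekick).
// It reports whether a "!~" rule with the given pattern would match the
// title, i.e. whether the title does NOT contain the pattern.
func matchesNegated(re *regexp.Regexp, title string) bool {
	return !re.MatchString(title)
}

func main() {
	// Made-up titles, for illustration only.
	titles := []string{
		"派评 | 近期值得关注的 App", // the kind of sspai post the rule is meant to keep
		"Apple Announces New iPad",  // a post from an unrelated feed
	}
	for _, t := range titles {
		fmt.Printf("%q matched by the !~ rule: %v\n", t, matchesNegated(sspaiPattern, t))
	}
}
```

The second title comes from a feed the rule was never meant for, yet the negated rule matches it, which is consistent with everything outside sspai.com being marked as read.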

dewey commented 1 year ago

Hey @imjoeyli, thanks for reporting this. I don't currently have time to investigate that, so if you want to give debugging this a shot I'd be happy to review it.

imjoeyli commented 1 year ago

I'm not a professional, just a self-hosting amateur, but I can learn and give it a try.

Would you please tell me how to change the log level to debug?

dewey commented 1 year ago

No worries, if you want to change the log level you can set the environment variable MF_LOG_LEVEL=debug. This should get you the desired result.

The code for this is here.

itohsnap commented 1 year ago

I'm not a Go programmer, so forgive me if I've missed something, but I think the issue is in /filter/service.go:

RunFilterJob() basically loops through all feeds. If a Feed URL contains any (rule) URL, then all unread entries for that feed are passed to evaluateRules(). Notably, if you have any wildcard/"*" (rule) URLs, then every feed will match a rule and all unread entries will be passed to evaluateRules().

evaluateRules(), for each entry, then loops through all rules to see if each rule's Filter Expression applies. If it does, then it returns that the entry should be marked as read.

Critically, evaluateRules() does not consider if each (rule) URL applies to the entry being evaluated.

If I've interpreted this properly, then the issue is actually that all rules are being applied to all unread entries from feeds covered by at least one rule. It just shows up much more dramatically, in general, with negation rules.

(Again, apologies if I'm off in left field.)
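A rough Go sketch of the scoping check described above, under stated assumptions: the rule type, appliesTo, and shouldMarkRead are hypothetical names and do not mirror the real structures in service.go, and only title rules with =~ and !~ are modelled. The URL test uses the same "feed URL contains rule URL" idea mentioned for RunFilterJob(), with "*" treated as matching every feed.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule is a simplified, hypothetical stand-in for one parsed killfile line.
type rule struct {
	URL     string         // feed URL from the rule, or "*" for all feeds
	Negate  bool           // true for "!~", false for "=~"
	Pattern *regexp.Regexp // compiled filter expression (title only in this sketch)
}

// appliesTo reports whether the rule is scoped to the given feed URL.
func (r rule) appliesTo(feedURL string) bool {
	return r.URL == "*" || strings.Contains(feedURL, r.URL)
}

// shouldMarkRead evaluates only the rules scoped to feedURL against the title.
func shouldMarkRead(rules []rule, feedURL, title string) bool {
	for _, r := range rules {
		if !r.appliesTo(feedURL) {
			continue // the per-rule URL check that appears to be missing
		}
		// "=~" fires on a match, "!~" fires on a non-match.
		if r.Pattern.MatchString(title) != r.Negate {
			return true
		}
	}
	return false
}

// killfile holds the three rules from the report earlier in this thread.
var killfile = []rule{
	{"https://sspai.com/feed", true, regexp.MustCompile(`(派评|一日一技|看什么|App\+1)`)},
	{"https://www.macstories.net/feed/", false, regexp.MustCompile(`(MacStories\sUnwind|Episode\s\d|Club\sMacStories|Kolide)`)},
	{"*", false, regexp.MustCompile(`\[Sponsor\]`)},
}

func main() {
	// Made-up entry title for illustration.
	title := "Apple Announces New iPad"
	fmt.Println(shouldMarkRead(killfile, "https://www.macstories.net/feed/", title))
	fmt.Println(shouldMarkRead(killfile, "https://sspai.com/feed", title))
}
```

With the URL check in place, the made-up MacStories title is left unread because the sspai.com negation rule is skipped for that feed, while the same title in the sspai.com feed is still marked as read by the negation rule. Whether the real fix belongs inside evaluateRules() or in its caller is left open here.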

--

(Incidentally: line 118 of service.go tests if tokens[1] == "description"... should that be "content", instead? "Description" makes it sound like the rule is testing against the feed's general description, not the content of each entry.)

dewey commented 1 year ago

Thanks for doing that investigation, that's a great lead! I'm not sure when I'll have time to take a look at that, but it'll be helpful for that or if someone else picks it up in the meantime.