feediron / ttrss_plugin-feediron

Evolution of ttrss_plugin-af_feedmod
https://discourse.tt-rss.org/t/plugin-update-feediron-v1-2-0/2018
MIT License

Help in setting up a filter config #198

Closed: mariomaz87 closed this issue 5 months ago

mariomaz87 commented 5 months ago

Expected Behavior

Hi, sorry to open an issue even though this is more a request for help; I don't think it's an actual bug. I have a website with many feeds such as https://www.ilfoglio.it/economia/rss.xml, and I want to create a filter in Feediron that discards an article when the linked page contains the words "abbonati per" in its content.

Current Behavior

Using the following config, I get no syntax error, but both the test and the actual feeds still display the articles I want to omit:

{
    "ilfoglio.it": {
        "filtering": {
            "rules": "xpath",
            "rule": {
                "match": "(.\/\/link)[contains(., 'abbonati per')]",
                "action": "omit"
            }
        }
    }
}

Steps to Reproduce

Insert the config into the Feediron plugin.

Context

dugite-code commented 5 months ago

Your config isn't a valid Feediron config. Please refer to the extensive Readme for the correct config format and to the complete config example provided there.

The syntax check is extremely simple and only verifies that you have entered valid JSON. Feediron fetches the article full-text and formats it. It has no ability to perform actions such as dropping an article.

That said, as part of the full-text process you could set an article tag and then use the TT-RSS content filters (Preferences -> Filters tab) to drop articles with that tag.

A rough example:

{
  "ilfoglio.it": {
    "type": "xpath",
    "xpath": [
      "section[@class='article']"
    ],
    "tags": {
      "type": "xpath",
      "replace-tags": false,
      "xpath": [
        "p[contains(text(),'abbonati per')]"
      ]
    }
  }
}

mariomaz87 commented 5 months ago

Thanks for the feedback. I read the documentation and your reply. Can I use the tags with the Readability type? Without Readability I'm not able to capture the contents of the article. I'm trying this:

{
  "ilfoglio.it": {
    "type": "readability",
    "tags": {
      "type": "search",
      "pattern": [
        "\/abbonati\/"
      ],
      "match": [
        "!paywall"
      ]
    }
  }
}

I don't know why it works only using the "!" match. Using this config it tags the article Paywall even if the word "abbonati" is in the result section of the testing page. If I use "paywall" instead of "!paywall" the test does not run.

Thanks!

mariomaz87 commented 5 months ago

I made another modification using regex and it seems to be working:

{
  "ilfoglio.it": {
    "type": "readability",
    "tags": {
      "type": "regex",
      "pattern": "\/Abbonati per continuare a leggere\/"
    }
  }
}

Now the tag "Abbonati per continuare a leggere" is saved if the sentence is found in the article's contents.
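
For anyone reading along, here is a rough Python sketch of what the regex config above appears to do. It is only an illustration, not Feediron's actual code (the plugin is written in PHP). The function name `paywall_tag` is made up for this example. The sketch assumes the pattern is delimited by "/" with no flags, and that the matched text becomes the tag, which fits the behaviour described above. Note that in JSON "\/" is just an escaped "/", so the stored pattern is /Abbonati per continuare a leggere/.

```python
import json
import re

# The regex config from this thread, kept as a raw JSON string.
CONFIG = r'''
{
  "ilfoglio.it": {
    "type": "readability",
    "tags": {
      "type": "regex",
      "pattern": "\/Abbonati per continuare a leggere\/"
    }
  }
}
'''


def paywall_tag(article_text, config_json=CONFIG, site="ilfoglio.it"):
    """Hypothetical helper: return the matched text as a tag, or None."""
    pattern = json.loads(config_json)[site]["tags"]["pattern"]
    # json.loads turns "\/" into "/", leaving a delimited pattern "/.../".
    body = pattern.strip("/")  # assumption: "/" delimiters and no flags
    match = re.search(body, article_text)
    return match.group(0) if match else None


if __name__ == "__main__":
    print(paywall_tag("Intro. Abbonati per continuare a leggere. Fine."))
    print(paywall_tag("Testo completo dell'articolo."))
```

The match is case-sensitive, so an article containing only "abbonati per" in lowercase would not get the tag in this sketch.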

dugite-code commented 5 months ago

Great to see you got it working, hope that all works out for you.