feediron / ttrss_plugin-feediron

Evolution of ttrss_plugin-af_feedmod
https://discourse.tt-rss.org/t/plugin-update-feediron-v1-2-0/2018
MIT License
204 stars, 34 forks

Reformat URLs on Multipage-Handling #199

Closed monofox closed 4 months ago

monofox commented 4 months ago


Bugfix/Enhancement

Proposed Changes

Situation: an article link needs to be reformatted with a regular expression, like:

"reformat": [
    {
        "type": "regex",
        "pattern": "\/\\.html$\/",
        "replace": ".amp.html"
    }
]

If multipage handling is used and configured like:

"multipage": {
    "xpath": "ol[@class='list-pages' and not(@id='atoc_line')]\/li\/a[text() != '\u203a']",
    "append": true,
    "reformat": true // <== this is the new option!
},

the links to the following pages are extracted and fetched, and the pages are appended to one another.

Unfortunately, the links the newspaper provides for the following pages must be modified in the same way before they can be fetched properly. Otherwise the request returns a consent banner instead of the content, because feediron does not support cookies yet.

Therefore, this pull request adds a new "reformat" option to the multipage handling, which re-applies the root-level reformat rules to each extracted page link.
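To make the intended behaviour concrete, here is a minimal Python sketch of the idea. It is not the plugin's PHP code. The helper names (`strip_delimiters`, `reformat_url`, `multipage_urls`) and the `example.com` URLs are made up for illustration, and only rules of type "regex" are handled, on the assumption that the pattern is wrapped in `/` delimiters without flags.

```python
import json
import re

# Hypothetical configuration mirroring the fragments above (the xpath is omitted).
CONFIG = json.loads(r'''
{
    "reformat": [
        {"type": "regex", "pattern": "/\\.html$/", "replace": ".amp.html"}
    ],
    "multipage": {"append": true, "reformat": true}
}
''')


def strip_delimiters(pattern: str) -> str:
    """Drop PCRE-style '/' delimiters (simplification: flags are not handled)."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        return pattern[1:-1]
    return pattern


def reformat_url(url: str, rules: list) -> str:
    """Apply every 'regex' rule to the URL, in order."""
    for rule in rules:
        if rule.get("type") == "regex":
            url = re.sub(strip_delimiters(rule["pattern"]), rule["replace"], url)
    return url


def multipage_urls(links: list, config: dict) -> list:
    """Re-apply the root-level rules to page links only if multipage.reformat is set."""
    if config.get("multipage", {}).get("reformat", False):
        return [reformat_url(link, config.get("reformat", [])) for link in links]
    return list(links)


# Placeholder URLs, not taken from the pull request.
article = reformat_url("https://example.com/news/story.html", CONFIG["reformat"])
pages = multipage_urls(
    ["https://example.com/news/story-2.html", "https://example.com/news/story-3.html"],
    CONFIG,
)
print(article)
print(pages)
```

With `"reformat": true` in the multipage block, the page links get the same `.amp.html` rewrite as the article link. Without it they are passed through unchanged, which matches the behaviour described above for the situation before this change.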

Example (with dummy urls, with new option):

Is this fine, or do you have any other contribution guidelines that I didn't find?

Thank you and best regards, monofox

dugite-code commented 4 months ago

Fantastic, thanks for your contribution