Closed uqs closed 2 years ago
Sorry for the late reply; GitHub is really terrible at notifying me when an issue is opened.
The xpath filter only grabs a single index/occurrence. To join several of them, you need to select each instance manually.
If you want to select all the instances of an xpath you need to use the all_xpath filter. Note: I've marked it experimental only because I don't use it personally: https://github.com/feediron/ttrss_plugin-feediron/tree/master/filters/fi_mod_all_xpath
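To illustrate the difference between taking a single occurrence and taking all of them, here is a minimal sketch in Python using the standard-library ElementTree. It is not Feediron's code; the markup, the `extract` helper, and its `select_all` flag are hypothetical and only mimic the behaviour described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: two article-guts sections,
# each wrapped in its own article-content div.
SAMPLE = """
<html><body>
  <div class="article-content"><section class="article-guts"><p>First half.</p></section></div>
  <div class="article-content"><section class="article-guts"><p>Second half.</p></section></div>
</body></html>
"""


def extract(markup, xpath, select_all=False):
    """Return the serialized first match, or all matches joined together."""
    root = ET.fromstring(markup)
    matches = root.findall(xpath)  # matches come back in document order
    if not matches:
        return ""
    chosen = matches if select_all else matches[:1]
    return "".join(ET.tostring(el, encoding="unicode").strip() for el in chosen)


# Single occurrence: only the first section survives.
first_only = extract(SAMPLE, ".//section[@class='article-guts']")

# All occurrences: both sections are concatenated.
everything = extract(SAMPLE, ".//section[@class='article-guts']", select_all=True)

print(first_only)
print(everything)
```

With a single-occurrence selection the second half of the article is silently dropped, which matches the symptom reported in this issue.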
Expected Behavior
The arstechnica.com recipe is broken and/or they changed their site layout so that the extracted element is often just half of the content.
Recipe Code
Context
Ignore the modify regex; that is not the problem. I only have this example article at hand, and it is not supposed to be a political statement or anything (I'm just curious what all this Impostor stuff is actually about).
https://arstechnica.com/gaming/2020/10/aocs-twitch-streaming-debut-attracts-over-435000-among-us-viewers/
Run that article through the filter, and you'll notice that the bottom half of the article is missing.
The article structure is roughly like so:
The filter grabs the first article-content and runs with it. So I changed it to:
Because in Chrome, I can select it in the console using:
$x("//section[@class='article-guts']")[1]
But in Feediron, this results in all content getting dropped (and then the fallback to displaying the full HTML).
I'm confused as to how XPath works, how it works in Feediron, and whether it would concatenate two expressions or whatever. Just running with the single filter of:
"section[@class='article-guts'][last()]"
results in, you guessed it, the first article-guts content getting displayed, not the 2nd or last one.
Help? Does Feediron extract both XPaths and concatenate them? How can I get it to extract both article-guts sections? Why does it think the forward slashes need to be escaped, and why does it rewrite them?
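One possible explanation for the `[last()]` result, assuming each article-guts section sits under its own parent element (the structure listing did not survive here, so this is a guess): in XPath, a predicate on a step such as `//section[...][last()]` is evaluated per parent, so every section that is the last one among its siblings matches. Taking the first of those matches then yields the first section again. In full XPath the document-wide last match is written `(//section[@class='article-guts'])[last()]`; whether Feediron's xpath filter accepts that form is not confirmed in this thread. Note also that the `[1]` in the Chrome console line is a zero-based JavaScript array index applied to the result of `$x(...)`, not an XPath predicate. The sketch below uses Python's ElementTree, which supports only a subset of XPath, on hypothetical markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: each article-guts section has a different parent.
SAMPLE = """
<html><body>
  <div class="article-content"><section class="article-guts"><p>First half.</p></section></div>
  <div class="article-content"><section class="article-guts"><p>Second half.</p></section></div>
</body></html>
"""

root = ET.fromstring(SAMPLE)

# [last()] is checked against each section's own siblings, so both sections
# match here: each one is the last section inside its parent div.
per_parent_last = root.findall(".//section[@class='article-guts'][last()]")
texts = [el.findtext("p") for el in per_parent_last]

# To get the last match in the whole document, collect all matches first
# and then index the resulting list.
all_matches = root.findall(".//section[@class='article-guts']")
document_last = all_matches[-1].findtext("p")

print(texts)
print(document_last)
```

If the real page nests the sections this way, a filter that keeps only the first match of the `[last()]` expression would still return the top half of the article.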