Closed anewuser closed 6 years ago
Right, that was not possible so far. I have now modified the extract block to support concat(), an XPath function meant to do exactly that. Please see https://www.pipes.digital/pipe/boNyzPOP for an example of how it can be used to combine .question and .answer. Hope this helps :)
It works, but there's something else I didn't mention. The HTML tags (<p>, <em>, <ul>) are stripped from the answer code, so paragraphs and lists are lost and everything is put together in a single block. Compare the output:
You're right. And I'm not able to modify that xpath expression to concatenate the html of those elements.
I did enable raw (inner_)html output for regular xpath expressions. So it was definitely useful to look into this. But the concat just does not work like this, at least so far.
I will have to think about the best solution here. Offer a custom concat for the extract block that preserves the raw content of an element? Have a separate block that can concatenate strings? Is there an alternative to xpath that could be offered as a separate block?
Wouldn't it be possible to use a simple multiple-element CSS selector like .question, .answer in an extract block for this and put them together with their raw code?
Right now this creates two separate feed posts, but is there any case where that's the intended behavior?
I think that can quite often be the intended behaviour. Just imagine that there are multiple types of news on the site, .sport_news and .top_news, and you want to collect all of them. Then you might write such a combined selector and send them to the feed builder.
And I think I could not implement this. If .question, .answer finds 10 questions and 5 answers, it would not be helpful to have all of them in one single item. I think it is rather http://www.learnersdictionary.com/qa/post/latest that is the edge case, because here the concatenation is simple and useful :)
I'm leaning towards "implement a concat_raw in the extract block that takes the given selectors, gets their inner html and finally merges them together". Doesn't that sound like the right solution? But I am also still wondering whether a generic concatenation block could be useful, and what it would look like.
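To make the concat_raw idea concrete, here is a minimal sketch in Python. It is not how Pipes implements anything: the sample markup is made up, the helper names (text_value, inner_html, concat_raw) are hypothetical, and it assumes well-formed markup so that the standard library's ElementTree can parse it. It shows the plain text value of the elements (tags lost) next to the merged inner HTML (tags kept).

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample markup standing in for the real page.
PAGE = (
    '<div>'
    '<div class="question"><p>Is it <em>who</em> or <em>whom</em>?</p></div>'
    '<div class="answer"><p>Use <em>whom</em> as an object.</p>'
    '<ul><li>To whom?</li></ul></div>'
    '</div>'
)

def text_value(el):
    """Text content only: every tag inside the element is dropped."""
    return "".join(el.itertext())

def inner_html(el):
    """Raw content of the element: its leading text plus its serialized children."""
    children = "".join(ET.tostring(child, encoding="unicode") for child in el)
    return (el.text or "") + children

def concat_raw(root, class_names):
    """Hypothetical concat_raw: merge the inner HTML of the first match per class."""
    parts = []
    for name in class_names:
        el = root.find(f".//*[@class='{name}']")
        if el is not None:
            parts.append(inner_html(el))
    return "".join(parts)

root = ET.fromstring(PAGE)
classes = ["question", "answer"]

# Text-only concatenation: paragraphs and lists collapse into one block.
plain = "".join(text_value(root.find(f".//*[@class='{c}']")) for c in classes)

# Raw concatenation: <p>, <em> and <ul> survive.
raw = concat_raw(root, classes)
```

The sketch only takes the first match per selector, which sidesteps the question of what to do when the selectors match different numbers of elements.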
I see, so that's by design. I've been using Feed43 for a long time, but I think that all feeds I've ever created followed a strict pattern.
I did imagine cases like your example with Pipes users, but thought that people would just extract the different sections into different feed blocks and then recombine them.
As for the new raw blocks, test them with unwanted tags like script to make sure there's no security problem. Maybe people will want to keep iframes and media tags too, but I think scripts should always be removed.
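A rough sketch of that kind of check, using only Python's standard html.parser. The class and function names (ScriptStripper, strip_scripts) are made up for illustration and this is not a complete sanitizer: it only drops script elements and their contents, and leaves things like onclick attributes or javascript: URLs untouched. A real deployment would more likely want an allowlist-based sanitizer.

```python
from html.parser import HTMLParser

class ScriptStripper(HTMLParser):
    """Re-emits the HTML it is fed, minus <script> elements and their contents."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a <script> element

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.skip_depth += 1
        elif self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>; emitted once, as written.
        if tag != "script" and self.skip_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "script":
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")

def strip_scripts(html):
    """Hypothetical helper: return html without script elements (comments are dropped too)."""
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

sample = '<p>Hi <em>there</em></p><script>alert("x")</script><ul><li>a &amp; b</li></ul>'
cleaned = strip_scripts(sample)
```

Paragraphs, emphasis and lists pass through unchanged here, so the raw output keeps the formatting while losing the script.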
Here's what that feed looks like in the Feed43 editor, in case you're curious:
Thanks for that image, it led me in an interesting direction. I added a merge items block that should solve this use case, see https://www.pipes.digital/pipe/YPOdK0ND
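For readers wondering what merging items could mean in practice, here is one possible reading as a Python sketch. This is an assumption, not a description of the actual block: the function name merge_items is hypothetical, items are reduced to plain HTML strings, and pairing is done by position. Refusing mismatched counts is a design choice of this sketch, motivated by the "10 questions and 5 answers" concern above.

```python
def merge_items(first, second):
    """Pair two item lists by position and join each pair's content.

    Hypothetical sketch: items are plain HTML strings here, whereas a real
    feed item would also carry a title, link, date and so on.
    """
    if len(first) != len(second):
        # With unequal counts there is no obvious pairing, so refuse.
        raise ValueError(
            f"cannot pair {len(first)} items with {len(second)} items"
        )
    return [a + b for a, b in zip(first, second)]

questions = ["<p>Q1</p>", "<p>Q2</p>"]
answers = ["<p>A1</p>", "<p>A2</p>"]
posts = merge_items(questions, answers)
```

With one extract block per selector feeding such a merge, each post body gets its question followed by its answer.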
Thanks for looking into it. The design of the page you linked to is broken on my 1366x768 screen, though. I can't see the last block or move it. I've tried it on Firefox and Opera:
If you use a browser other than Firefox, you can click on the blue background and drag it around (there is a bug in Firefox preventing that from working). But that block is just a feed builder block; merge items connects to the content input.
I know the point was to show me the new feature, sorry. I just wanted to report the resolution bug before I had some time to test it.
You've used xpath in your example, but actually simple .class selectors work too.
Here's the final feed: https://www.pipes.digital/pipe/YPOdgaND
Thanks a lot. 😄
I'm trying to recreate this feed with Pipes, but the contents for each post come from two separate divs, and apparently you can't do that with Pipes, right?
https://feed43.com/learnersdictionary.xml
https://www.pipes.digital/feed/YPOdgaND
The source URL is: http://www.learnersdictionary.com/qa/post/latest (this always redirects to their most recent article).
Each post body should be generated from .question and .answer. I've tried extracting .question, .answer with a single block, and extracting them separately and then using a combine block, but both resulted in two separate posts.