RSS-Bridge / rss-bridge

The RSS feed for websites missing it
https://rss-bridge.org/bridge01/
The Unlicense

[XPathBridge] Have a way to use the image URL inside content rather than as attachment #2843

Open imagoiq opened 2 years ago

imagoiq commented 2 years ago

Hi,

I wonder if it would be possible to have a parameter that allows using the image URL as an HTML tag inside the content rather than as an attachment. For some feeds, this would be much appreciated. I'm not sure whether the current behavior is a design decision (regarding security, for example) or simply something nobody thought of. Also, if there is a way to do it with a (complicated) XPath function, I don't mind.

Thanks in advance

cc maintainer @Niehztog

Niehztog commented 2 years ago

Hello @imagoiq, as far as I understand what you intend to do, it should be feasible to implement that functionality right inside your specific bridge rather than in the XPathAbstract parent class. Let's see if we can accomplish that together. In your bridge class, override the parent method getItems(), for example like this:

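public function getItems() {
    foreach ($this->items as $item) {
        // Modify each item's content here, e.g. append the first
        // enclosure (the image URL) to the content as an <img> tag:
        $enclosures = $item->getEnclosures();
        if (!empty($enclosures)) {
            $content = $item->getContent();
            $content .= '<img src="' . htmlspecialchars($enclosures[0]) . '">';
            $item->setContent($content);
        }
    }
    return $this->items;
}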

Something like this should do (I did not test it). Please try it out and let me know if that helps. Based on your experience, we might be able to implement a new helper function for that in XPathAbstract later. Maybe you have good ideas on how to implement that in a meaningful way?

[Edit] I see that you were talking about XPathBridge rather than implementing your own bridge. In order for my above instructions to work, you should put your XPath expressions in your own bridge implementation as described here.

Niehztog commented 2 years ago

@imagoiq There is another way to do it, with a more complicated XPath expression that uses the concat function. For example, in your XPath expression you could use:

concat(.//div[@class="body-text"]/text(), " ", .//img[1]/@src)

This way you can concatenate the string values of two nodes into one string and use that, for example, as the content XPath expression.
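To see what such an expression evaluates to, here is a minimal standalone sketch using PHP's built-in DOMDocument and DOMXPath, outside of RSS-Bridge. The HTML snippet and variable names are made up for illustration. Since concat() works on string values, the result is a plain string; the second variant assembles an img tag inside the expression itself.

```php
<?php
// Hypothetical sample markup; not taken from any real feed.
$html = '<div class="item"><div class="body-text">Hello world</div><img src="/media/pic.jpg"></div>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Context node, comparable to a single feed item.
$item = $xpath->query('//div[@class="item"]')->item(0);

// The expression from above: text, a space, then the raw image URL.
$plain = $xpath->evaluate(
    'concat(.//div[@class="body-text"]/text(), " ", .//img[1]/@src)',
    $item
);

// Variant that wraps the URL in an <img> tag inside the expression.
$withTag = $xpath->evaluate(
    'concat(.//div[@class="body-text"]/text(), \'<img src="\', .//img[1]/@src, \'">\')',
    $item
);

echo $plain, PHP_EOL;
echo $withTag, PHP_EOL;
```

Note that the image URL stays relative in both variants; the expression alone does not resolve it against the site root.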

imagoiq commented 2 years ago

Hi @Niehztog,

Thanks for your proposition and taking time to answer.

For context, I can add that I was a bit surprised, the first time I used the bridge, that the expression under "Item image selector" adds the image as an attachment. In comparable services, such as Politepol, the image usually goes directly into the content, so this is rather confusing from a user's standpoint (it could be fixed by changing the label, but that is not my main concern).

From your answers, I'm not sure whether you saw my PR proposal: https://github.com/RSS-Bridge/rss-bridge/pull/2847. The latest idea is to have a post-processing function in the abstract class that you can call to easily manipulate the fields.

After some thought, I think using an XPath expression for this is probably too verbose and difficult. If I used your expression (e.g. with the example mentioned here: https://github.com/RSS-Bridge/rss-bridge/issues/2842), I would also need to add an HTML img tag to render the picture inside the content, as well as the root path of the URL, since the image uses a relative URL. That is a lot of work, with no access to any of the functions that clean the URL either.
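For illustration, the extra steps described above (resolve the relative URL against the site root, wrap it in an img tag, append it to the content) could look roughly like this in plain PHP. This is only a sketch: resolveUrl() and embedImage() are hypothetical helper names, not existing RSS-Bridge functions, and the resolver treats every non-absolute URL as relative to the site root and ignores ports.

```php
<?php
// Hypothetical helper: resolves a root-relative path such as "/media/pic.jpg"
// against the scheme and host of $baseUrl. Absolute URLs pass through unchanged.
function resolveUrl(string $baseUrl, string $url): string
{
    if (parse_url($url, PHP_URL_SCHEME) !== null) {
        return $url; // already absolute
    }
    $parts = parse_url($baseUrl);
    $root = $parts['scheme'] . '://' . $parts['host'];
    return $root . '/' . ltrim($url, '/');
}

// Hypothetical helper: appends the image as an <img> tag to the item content.
function embedImage(string $content, string $imageUrl, string $baseUrl): string
{
    $src = htmlspecialchars(resolveUrl($baseUrl, $imageUrl), ENT_QUOTES);
    return $content . '<img src="' . $src . '">';
}

echo embedImage('<p>Some text</p>', '/media/pic.jpg', 'https://example.com/news/'), PHP_EOL;
```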

Niehztog commented 2 years ago

@imagoiq I agree with your reasoning and I like your idea of having a post-processing function in the abstract class that you can call to manipulate the fields. My proposed way of overriding the getItems() method provides a similar, but less obvious, solution that would not require any code changes to XPathAbstract. So basically I think it's optional. I was not aware of your PR https://github.com/RSS-Bridge/rss-bridge/pull/2847, but I support it. Let's hope the maintainers will consider it (I do not have the access rights to accept PRs).
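To make the general idea of a post-processing hook concrete, here is a rough standalone sketch. It is not the code from the PR and does not use RSS-Bridge's real classes: SketchAbstractBridge, SketchBridge, postProcessItem() and addItem() are all hypothetical names, and items are modeled as plain arrays for brevity.

```php
<?php
// Hypothetical stand-in for an abstract bridge; not RSS-Bridge's XPathAbstract.
abstract class SketchAbstractBridge
{
    protected array $items = [];

    // Default hook: returns the item unchanged. Subclasses may override it.
    protected function postProcessItem(array $item): array
    {
        return $item;
    }

    // Called once per extracted entry; runs the hook before storing the item.
    protected function addItem(array $item): void
    {
        $this->items[] = $this->postProcessItem($item);
    }

    public function getItems(): array
    {
        return $this->items;
    }
}

class SketchBridge extends SketchAbstractBridge
{
    public function collect(): void
    {
        // Made-up entry standing in for the result of the XPath extraction.
        $this->addItem([
            'content' => 'Some text',
            'enclosures' => ['https://example.com/pic.jpg'],
        ]);
    }

    protected function postProcessItem(array $item): array
    {
        // Append the first enclosure to the content as an <img> tag.
        if (!empty($item['enclosures'])) {
            $item['content'] .= '<img src="' . htmlspecialchars($item['enclosures'][0]) . '">';
        }
        return $item;
    }
}

$bridge = new SketchBridge();
$bridge->collect();
echo $bridge->getItems()[0]['content'], PHP_EOL;
```

Compared with overriding getItems(), a hook like this keeps the per-item logic in one clearly named place, at the cost of a small change to the abstract class.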