Describe the bug
When an RSS item has both a <description> and a <content:encoded> element, the content:encoded element is always used as the article body. In some feeds, however, content:encoded is present but empty, while description holds the actual content (e.g. a summary). As a result, the item appears to have no body content in Vienna.
In general the existing behavior makes sense, because the "RSS Best Practices" document recommends that a publisher who includes both elements store the full article in content:encoded. However, RSS feeds are the wild west, and that document is not really a standard (nor does it dictate what reader applications should do in this case), so I suggest that Vienna handle this case better.
At the very least, the value of the articleBody variable shouldn't be overwritten with an empty string when the content:encoded element is empty. I will submit a pull request.
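To illustrate the fallback I have in mind, here is a minimal sketch in Python rather than Vienna's Objective-C. It is not the actual parser code: the article_body helper and the SAMPLE feed are made up for this example, and it only assumes the standard content module namespace for content:encoded.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS content module, which defines content:encoded.
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"


def article_body(item: ET.Element) -> str:
    """Return the body for one <item>, preferring a non-empty content:encoded.

    Hypothetical helper for illustration only; not part of Vienna.
    """
    description = (item.findtext("description") or "").strip()
    encoded = (item.findtext(f"{{{CONTENT_NS}}}encoded") or "").strip()
    # Only let content:encoded win when it actually carries text;
    # an empty or missing element must not overwrite the description.
    return encoded if encoded else description


# Made-up feed covering the problem case and the normal case.
SAMPLE = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Empty content:encoded</title>
      <description>Short summary.</description>
      <content:encoded></content:encoded>
    </item>
    <item>
      <title>Both present</title>
      <description>Short summary.</description>
      <content:encoded><![CDATA[<p>Full article.</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""

bodies = [article_body(item) for item in ET.fromstring(SAMPLE).iter("item")]
```

With this rule the first item falls back to its description, while the second still uses content:encoded, so the existing preference is kept whenever the publisher actually supplies the full article.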
To Reproduce
Subscribe to this feed: https://feeds.a.dj.com/rss/RSSWorldNews.xml (Wall Street Journal World News) and look at the entries.
Attached is a cached version of this feed: folder55.xml.txt
Excerpt:
Additional information:
Relevant source code: https://github.com/ViennaRSS/vienna-rss/blob/8dae3483fe1ae9f5d1a514d76351c3490d969f15/Vienna/Sources/Parsing/RSSFeed.m#L230-L236