martinrotter / rssguard

Feed reader (and podcast player) which supports RSS/ATOM/JSON and many web-based feed services.
GNU General Public License v3.0

[BUG]: Images don't load in article previewer if image URL in articles is not spaced out #785

Closed RetroAbstract closed 2 years ago

RetroAbstract commented 2 years ago

Brief description of the issue

Video thumbnails in articles shown in the article previewer will not load if the image URLs are not separated from the surrounding text.

It appears RSS Guard cannot recognize an image to be loaded from an article in the article previewer if the image URL is immediately followed by text (no spacing).

The following example compares the same video previewed from an Invidious article and a YouTube article. RSS Guard loads the thumbnail in the Invidious article because its text is properly formatted:

Invidious article preview: Invidious

YouTube article preview: YouTube

This happens for any article where the image URLs aren't spaced out.

How to reproduce the bug?

  1. Add a feed, for example a YouTube channel RSS feed
  2. Fetch articles from feed
  3. Preview an article in the article previewer
  4. Make sure "Enable external resources" is turned on in the article previewer
  5. Notice the video thumbnail picture will not load/display as the text is not properly formatted

What was the expected result?

The expected result was to see the image from the image URL in the article displayed in the article previewer.

What actually happened?

The image from the image URL in the article was not displayed in the article previewer.

Debug log

Actions: Starting RSS Guard dev build 4.2.3 > Opened > Clicking Linus Tech Tips YouTube feed > Clicking the "I've been waiting TOO long" article displaying it in article previewer > Closing RSS Guard.

time=" 0.186" type="debug" -> core: Initializing settings in 'Desktop\rssguard-devbuild-e142522a-nowebengine-win64\data4\config\config.ini' (portable way). time=" 0.214" type="debug" -> database: File-based SQLite database connection 'DatabaseFactory' to file 'Desktop\rssguard-devbuild-e142522a-nowebengine-win64\data4\database\database.db' seems to be established. time=" 0.215" type="debug" -> database: File-based SQLite database has version '2'. time=" 0.221" type="debug" -> gui: Available icon theme paths: (:/icons, :/graphics, Desktop/rssguard-devbuild-e142522a-nowebengine-win64\data4\icons, Desktop/rssguard-devbuild-e142522a-nowebengine-win64\icons) time=" 0.222" type="debug" -> gui: Installed icon themes are: '', 'Breeze', 'Breeze Dark', 'Faenza', 'Numix' time=" 0.223" type="debug" -> gui: Loading icon theme 'Breeze'. time=" 0.223" type="debug" -> gui: Found path of base skin: ':\skins\nudus-base'. time=" 0.224" type="debug" -> gui: Trying to load base file ':\skins\nudus-base\html_wrapper.html' for the skin. time=" 0.224" type="debug" -> gui: Local file ':\skins\nudus-light\html_style.css' exists, using it for the skin. time=" 0.225" type="debug" -> gui: Trying to load base file ':\skins\nudus-base\html_enclosure_image.html' for the skin. time=" 0.225" type="debug" -> gui: Trying to load base file ':\skins\nudus-base\html_single_message.html' for the skin. time=" 0.226" type="debug" -> gui: Trying to load base file ':\skins\nudus-base\html_enclosure_every.html' for the skin. time=" 0.226" type="debug" -> gui: Local file ':\skins\nudus-light\qt_style.qss' exists, using it for the skin. time=" 0.227" type="debug" -> gui: Trying to load base file ':\skins\nudus-base\html_adblocked.html' for the skin. time=" 0.227" type="debug" -> gui: Setting style: 'windowsvista'. time=" 0.228" type="debug" -> gui: Skin 'nudus-light' loaded. time=" 0.228" type="debug" -> network: Disabling application-wide proxy completely. 
time=" 0.233" type="debug" -> core: OpenSSL version: 'OpenSSL 1.1.1j 16 Feb 2021'. time=" 0.233" type="debug" -> core: OpenSSL supported: 'true'. time=" 0.234" type="debug" -> core: Starting RSS Guard 4.2.3. time=" 0.234" type="debug" -> core: Instantiated class 'Application'. time=" 0.235" type="debug" -> core: Starting to load active localization. Desired localization is 'en_GB'. time=" 0.236" type="debug" -> core: Application localization 'en_GB' loaded successfully, specifically sublocalization 'en_GB' was loaded. time=" 0.236" type="warning" -> core: Qt localization 'en_GB' WAS NOT loaded successfully. time=" 0.239" type="debug" -> database: SQLite database connection 'MessagesModel' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.242" type="debug" -> message-model: Repopulated model, SQL statement is now: 'SELECT Messages.id, Messages.is_read, Messages.is_important, Messages.is_deleted, Messages.is_pdeleted, Messages.feed, Messages.title, Messages.url, Messages.author, Messages.date_created, Messages.contents, Messages.enclosures, Messages.score, Messages.account_id, Messages.custom_id, Messages.custom_hash, Feeds.title, CASE WHEN length(Messages.enclosures) > 10 THEN 'true' ELSE 'false' END AS has_enclosures FROM Messages LEFT JOIN Feeds ON Messages.feed = Feeds.custom_id AND Messages.account_id = Feeds.account_id WHERE 0 > 1;'. time=" 0.243" type="debug" -> core: Auto-download timer started with interval 60000 ms. time=" 0.243" type="debug" -> core: Creating FeedDownloader singleton. time=" 0.247" type="debug" -> gui: Creating main application form in thread: '0x1a70'. time=" 0.260" type="debug" -> gui: Current row changed - proxy 'QModelIndex(-1,-1,0x0,QObject(0x0))', source 'QModelIndex(-1,-1,0x0,QObject(0x0))'. time=" 0.333" type="debug" -> network: Settings of BaseNetworkAccessManager loaded. time=" 0.337" type="debug" -> network: Settings of BaseNetworkAccessManager loaded. 
time=" 0.344" type="debug" -> network: Settings of BaseNetworkAccessManager loaded. time=" 0.656" type="debug" -> gui: Creating tray icon menu. time=" 0.799" type="debug" -> core: Showing the main window when the application is starting. time=" 0.827" type="debug" -> database: SQLite database connection 'FeedReader' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.829" type="debug" -> database: SQLite database connection 'FeedlyEntryPoint' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.830" type="debug" -> database: SQLite database connection 'GmailEntryPoint' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.832" type="debug" -> database: SQLite database connection 'GreaderEntryPoint' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.833" type="debug" -> database: SQLite database connection 'OwnCloudServiceEntryPoint' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.835" type="debug" -> database: SQLite database connection 'StandardServiceEntryPoint' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.837" type="debug" -> core: Filter accepts row 'User (RSS/ATOM/JSON)' and filter result is: 'true'. time=" 0.838" type="debug" -> database: SQLite database connection 'StandardServiceRoot' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.840" type="debug" -> core: Custom ID of feed when loading from DB is '1'. time=" 0.841" type="debug" -> core: Custom ID of feed when loading from DB is '2'. 
time=" 0.842" type="debug" -> database: SQLite database connection 'RecycleBin' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.844" type="debug" -> database: SQLite database connection 'ImportantNode' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.846" type="debug" -> database: SQLite database connection 'RootItem' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.847" type="debug" -> database: SQLite connection 'StandardServiceRoot' is already active. time=" 0.848" type="debug" -> database: SQLite database connection 'StandardServiceRoot' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.849" type="debug" -> database: SQLite database connection 'TtRssServiceEntryPoint' to file 'Desktop/rssguard-devbuild-e142522a-nowebengine-win64/data4/database/database.db' seems to be established. time=" 0.850" type="debug" -> gui: User wants to have tray icon. time=" 0.851" type="debug" -> gui: Tray icon is available, showing now. time=" 0.853" type="debug" -> gui: Creating SystemTrayIcon instance. time=" 0.954" type="debug" -> gui: Showing tray icon immediately. time=" 1.092" type="debug" -> gui: Tray icon displayed. time=" 1.093" type="debug" -> gui: Feed list item expanded - User (RSS/ATOM/JSON) time=" 1.093" type="debug" -> core: Filter accepts row 'Recycle bin' and filter result is: 'true'. time=" 1.094" type="debug" -> core: Filter accepts row 'Important articles' and filter result is: 'true'. time=" 1.094" type="debug" -> core: Filter accepts row 'Unread articles' and filter result is: 'true'. time=" 1.095" type="debug" -> core: Filter accepts row 'Labels' and filter result is: 'true'. 
time=" 1.095" type="debug" -> core: Filter accepts row 'Linus Tech Tips' and filter result is: 'true'. time=" 1.095" type="debug" -> core: Filter accepts row 'Linus Tech Tips' and filter result is: 'true'. time=" 1.096" type="debug" -> core: No execution message received from other app instances. time=" 1.232" type="debug" -> network: Settings of BaseNetworkAccessManager loaded. time=" 1.948" type="debug" -> network: Destroying Downloader instance. time=" 1.948" type="debug" -> network: Destroying SilentNetworkAccessManager instance. time=" 4.356" type="debug" -> CTRL is NOT pressed while sorting articles - sorting with standard mode. time=" 4.356" type="debug" -> Displaying messages from feeds IDs: ''1'' and URLs: 'https://www.youtube.com/feeds/videos.xml?channel_id=UCXuqSBlHAE6Xw-yeJA0Tunw'. time=" 4.358" type="debug" -> message-model: Repopulated model, SQL statement is now: 'SELECT Messages.id, Messages.is_read, Messages.is_important, Messages.is_deleted, Messages.is_pdeleted, Messages.feed, Messages.title, Messages.url, Messages.author, Messages.date_created, Messages.contents, Messages.enclosures, Messages.score, Messages.account_id, Messages.custom_id, Messages.custom_hash, Feeds.title, CASE WHEN length(Messages.enclosures) > 10 THEN 'true' ELSE 'false' END AS has_enclosures FROM Messages LEFT JOIN Feeds ON Messages.feed = Feeds.custom_id AND Messages.account_id = Feeds.account_id WHERE Feeds.custom_id IN ('1') AND Messages.is_deleted = 0 AND Messages.is_pdeleted = 0 AND Messages.account_id = 1 ORDER BY Messages.id DESC;'. time=" 4.365" type="debug" -> core: Filter accepts row 'User (RSS/ATOM/JSON)' and filter result is: 'true'. time=" 4.365" type="debug" -> core: Filter accepts row 'Linus Tech Tips' and filter result is: 'true'. time=" 4.366" type="debug" -> core: Filter accepts row 'Linus Tech Tips' and filter result is: 'true'. time=" 4.366" type="debug" -> core: Filter accepts row 'Labels' and filter result is: 'true'. 
time=" 4.367" type="debug" -> core: Filter accepts row 'Important articles' and filter result is: 'true'. time=" 4.367" type="debug" -> core: Filter accepts row 'Unread articles' and filter result is: 'true'. time=" 4.368" type="debug" -> core: Filter accepts row 'Recycle bin' and filter result is: 'true'. time=" 6.556" type="debug" -> gui: Message list got focus with reason 'Qt::MouseFocusReason'. time=" 6.557" type="debug" -> gui: Current row changed - proxy 'QModelIndex(6,6,0x26c0fc0bc60,MessagesProxyModel(0x26c07dc53b0, name = MessagesProxyModel))', source 'QModelIndex(6,6,0x0,MessagesModel(0x26c07d9e860))'. time=" 7.498" type="debug" -> gui: Hovered link: 'QUrl(https://i3.ytimg.com/vi/bafzQBSktwk/hqdefault.jpg)'. time=" 7.547" type="debug" -> gui: Hovered link: 'QUrl()'. time=" 13.410" type="debug" -> core: Cleaning up resources and saving application state. time=" 13.411" type="debug" -> core: Close lock was obtained safely. time=" 13.415" type="debug" -> feed-downloader: Destroying FeedDownloader instance. time=" 13.459" type="debug" -> gui: Destroying FormMain instance. time=" 13.463" type="debug" -> gui: Destroying TabWidget instance. time=" 13.464" type="debug" -> gui: Destroying FeedMessageViewer instance. time=" 13.465" type="debug" -> gui: Destroying BaseToolBar instance. time=" 13.467" type="debug" -> network: Destroying Downloader instance. time=" 13.467" type="debug" -> network: Destroying SilentNetworkAccessManager instance. time=" 13.468" type="debug" -> network: Destroying Downloader instance. time=" 13.468" type="debug" -> network: Destroying SilentNetworkAccessManager instance. time=" 13.469" type="debug" -> gui: Destroying MessagesView instance. time=" 13.470" type="debug" -> gui: Destroying BaseToolBar instance. time=" 13.470" type="debug" -> gui: Destroying FeedsView instance. time=" 13.471" type="debug" -> gui: Destroying TabBar instance. time=" 13.471" type="debug" -> gui: Destroying StatusBar instance. 
time=" 13.472" type="debug" -> gui: Destroying SystemTrayIcon instance. time=" 13.485" type="debug" -> core: Destroying Application instance. time=" 13.485" type="debug" -> core: Destroying Mutex instance. time=" 13.496" type="debug" -> gui: Destroying IconFactory instance. time=" 13.496" type="debug" -> core: Destroying FeedReader instance. time=" 13.497" type="debug" -> feed-model: Destroying FeedsModel instance. time=" 13.497" type="debug" -> feed-model: Destroying FeedsProxyModel instance time=" 13.498" type="debug" -> message-model: Destroying MessagesModel instance. time=" 13.498" type="debug" -> message-model: Destroying MessagesProxyModel instance.

Operating system and version

martinrotter commented 2 years ago

Yes, I reproduced it. Note that several aspects of article contents come into play. I have now made these changes:

  1. These two options (shown in the attached screenshot) are now properly taken into account by the simple non-web-engine article viewer too.
  2. Your YouTube feed example did not actually contain pictures directly in its contents, but it does include "attachments" (called RSS enclosures). The web-engine viewer properly respected point 1 and embedded those attachments. The non-web-engine viewer now does this too, so the behavior is hopefully fixed.
  3. I also added automagic detection of whether the contents of an article are HTML; if they are not, they are now converted to HTML. This should result in a (much) nicer article layout in 90+ % of feeds.
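
For readers who want a feel for what points 2 and 3 involve, here is a minimal sketch in Python. It is not RSS Guard's actual code (RSS Guard is a C++/Qt application, and the real logic lives in the commit referenced below). The function names, the tag-matching heuristic, and the (url, mime_type) shape of an enclosure are all assumptions made for illustration only.

```python
import html
import re

# Assumed heuristic: contents count as HTML if they contain something that
# looks like an opening or closing tag. This is NOT RSS Guard's real check.
_TAG_RE = re.compile(r"</?[a-zA-Z][a-zA-Z0-9]*(\s[^<>]*)?/?>")
# Deliberately simple URL pattern: a URL ends at whitespace, quotes or angle brackets.
_URL_RE = re.compile(r"https?://[^\s<>\"']+")


def looks_like_html(contents: str) -> bool:
    """Return True if the article contents appear to contain HTML tags."""
    return _TAG_RE.search(contents) is not None


def plain_text_to_html(contents: str) -> str:
    """Escape plain text, turn URLs into links and keep line breaks."""
    parts = []
    pos = 0
    for match in _URL_RE.finditer(contents):
        # Escape the text between URLs so stray '<' or '&' stay literal.
        parts.append(html.escape(contents[pos:match.start()]))
        url = html.escape(match.group(0))
        parts.append(f'<a href="{url}">{url}</a>')
        pos = match.end()
    parts.append(html.escape(contents[pos:]))
    return "".join(parts).replace("\n", "<br>\n")


def enclosure_images_html(enclosures) -> str:
    """Build <img> tags for image enclosures given as (url, mime_type) pairs."""
    return "".join(
        f'<img src="{html.escape(url)}" alt="">\n'
        for url, mime_type in enclosures
        if mime_type.startswith("image/")
    )


def render_article(contents: str, enclosures=()) -> str:
    """Convert non-HTML contents to HTML, then append image enclosures."""
    body = contents if looks_like_html(contents) else plain_text_to_html(contents)
    return body + "\n" + enclosure_images_html(enclosures)


if __name__ == "__main__":
    print(render_article(
        "New video:\nhttps://www.youtube.com/watch?v=bafzQBSktwk",
        [("https://i3.ytimg.com/vi/bafzQBSktwk/hqdefault.jpg", "image/jpeg")],
    ))
```

The sketch keeps the two concerns separate: the contents are only converted when they do not already look like HTML, and enclosures are appended afterwards, so a thumbnail that exists only as an enclosure still ends up in the rendered article.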

11b99604a