Open ptrailblazer89 opened 2 days ago
Thanks for sharing these examples @ptrailblazer89.
I've added image support to the feed. By the way, this site provides its own RSS feed that should be much faster but also lacks images. Might want to give that a shot.
I see the issue here. The first entry (PCB Design and UI for a Canbus Motorsports Device) does not have a discrete date, so by default it is given today's date. "Last week" is too amorphous for Diffbot Extract to tell what day it is. I can't fix the date interpretation unfortunately, but this seems to only affect one of the entries.
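For anyone who wants to work around this on the consumer side, here is a minimal sketch of how a relative phrase like "Last week" could be turned into an approximate date instead of silently becoming today's date. This is not how Diffbot Extract handles dates; the function name `resolve_relative_date`, the handful of supported phrases, and the "exact vs. estimated" flag are all assumptions made for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical unit table: only days and weeks are handled in this sketch.
_UNIT_DAYS = {"day": 1, "week": 7}


def resolve_relative_date(text, today):
    """Return (estimated_date, is_exact) for a relative phrase, or None.

    is_exact is False when the phrase only narrows the date down to a
    range (e.g. "last week"), so the caller can mark it as approximate.
    """
    phrase = text.strip().lower()
    if phrase == "today":
        return today, True
    if phrase == "yesterday":
        return today - timedelta(days=1), True

    # "3 days ago", "1 week ago", ...
    match = re.fullmatch(r"(\d+)\s+(day|week)s?\s+ago", phrase)
    if match:
        count, unit = int(match.group(1)), match.group(2)
        estimate = today - timedelta(days=count * _UNIT_DAYS[unit])
        return estimate, unit == "day"

    # "last week" gives no specific day, so guess 7 days back and flag it.
    if phrase == "last week":
        return today - timedelta(days=7), False

    # Unrecognized phrase: let the caller decide on a fallback.
    return None
```

Returning `None` for anything unrecognized keeps the "fall back to today's date" decision explicit in the caller rather than hidden in the parser.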
We'll review this in the base extraction model. Will report back when it's fixed.
Thanks for the quick response. The image problem still persists.
The Feeder Android app is unable to decode the images: https://play.google.com/store/apps/details?id=com.nononsenseapps.feeder.play&hl=en_IN&pli=1 https://github.com/spacecowboy/Feeder
TEST SITE: https://blogs.blackberry.com/en/home
Below is the feed generated from: https://rss.diffbot.com/atom?url=https://blogs.blackberry.com/en/home
<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns="http://www.w3.org/2005/Atom">
<id>https://blogs.blackberry.com/en/home</id>
<title>BlackBerry Blogs</title>
<updated>2024-09-17T19:23:27.130771+00:00</updated>
<link href="https://rss.diffbot.com/rss?url=https://blogs.blackberry.com/en/home" rel="self"/>
<link href="https://blogs.blackberry.com/en/home" rel="alternate"/>
<generator uri="https://lkiesow.github.io/python-feedgen" version="1.0.0">python-feedgen</generator>
<icon>https://blogs.blackberry.com/etc.clientlibs/bb-spa-react/clientlibs/clientlib-react/resources/logo192.png</icon>
<subtitle>https://blogs.blackberry.com/en/home</subtitle>
<entry>
<id>https://blogs.blackberry.com/en/2024/09/top-multi-tenancy-console-cylance</id>
<title>Elevate Your IT Operations with the Updated Cylance Multi-Tenant Console</title>
<updated>2024-09-17T19:23:27.134618+00:00</updated>
<content>Announcing powerful updates we recently unveiled in the Cylance Multi-Tenant Console (MTC). It's the next step toward the future of IT Management.</content>
<link href="https://blogs.blackberry.com/en/2024/09/top-multi-tenancy-console-cylance"/>
<link href="https://images.blackberry.com/is/image/blackberry/multi-tenant-thumb-466x261?wid=466&amp;fmt=jpg"/>
<published>2024-09-13T00:00:00+00:00</published>
</entry>
<entry>
<id>https://blogs.blackberry.com/en/2024/09/memory-threat-detection</id>
<title>Detecting Threats in Memory: The Role of Advanced Sensors</title>
<updated>2024-09-17T19:23:27.134249+00:00</updated>
<content>Traditional methods often fail to detect memory-based cyberattacks. Advanced sensors that monitor and analyze memory are key to closing this gap.</content>
<link href="https://blogs.blackberry.com/en/2024/09/memory-threat-detection"/>
<link href="https://images.blackberry.com/is/image/blackberry/memory-attack-thumb-466x261?wid=466&amp;fmt=jpg"/>
<published>2024-09-12T00:00:00+00:00</published>
</entry>
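One thing that stands out in the entries above: the image URL is emitted as a plain `<link>` with no `rel` or `type` attribute. Per the Atom spec (RFC 4287), a link without `rel` is treated as `rel="alternate"`, so a reader has no way to tell it apart from the article link. The sketch below uses Python's standard `xml.etree.ElementTree` to list the links of one entry and show that none is marked as an image enclosure. The helper names `describe_links` and `enclosure_images` are made up for this example, and whether Feeder actually looks for `rel="enclosure"` is an assumption I have not verified.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# One entry copied from the generated feed (ampersand escaped so it parses).
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
<id>https://blogs.blackberry.com/en/2024/09/memory-threat-detection</id>
<title>Detecting Threats in Memory: The Role of Advanced Sensors</title>
<link href="https://blogs.blackberry.com/en/2024/09/memory-threat-detection"/>
<link href="https://images.blackberry.com/is/image/blackberry/memory-attack-thumb-466x261?wid=466&amp;fmt=jpg"/>
</entry>"""


def describe_links(entry_xml):
    """Return href/rel/type for every atom:link in a single entry."""
    entry = ET.fromstring(entry_xml)
    links = []
    for link in entry.findall(ATOM + "link"):
        links.append({
            "href": link.get("href"),
            # RFC 4287: a missing rel means "alternate".
            "rel": link.get("rel", "alternate"),
            "type": link.get("type"),
        })
    return links


def enclosure_images(links):
    """Pick out links explicitly marked as image enclosures."""
    return [
        link["href"]
        for link in links
        if link["rel"] == "enclosure"
        and (link["type"] or "").startswith("image/")
    ]


links = describe_links(SAMPLE_ENTRY)
for link in links:
    print(link["rel"], link["type"], link["href"])
print("image enclosures found:", len(enclosure_images(links)))
```

If the image link carried `rel="enclosure"` and a `type` such as `image/jpeg`, `enclosure_images` would return it; with the feed as generated, it returns nothing.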
Below is the feed from: https://phys.org/rss-feed/
<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<title>Phys.org - latest science and technology news stories</title>
<link>https://phys.org/</link>
<language>en-us</language>
<description>Phys.org internet news portal provides the latest news on science including: Physics, Nanotechnology, Life Sciences, Space Science, Earth Science, Environment, Health and Medicine.</description>
<item>
<title>Europa Clipper: 8 things to know about NASA's mission to an ocean moon of Jupiter</title>
<description>The first NASA spacecraft dedicated to studying an ocean world beyond Earth, Europa Clipper aims to find out whether the ice-encased moon Europa could be habitable.</description>
<link>https://phys.org/news/2024-09-europa-clipper-nasa-mission-ocean.html</link>
<category>Space Exploration Planetary Sciences </category>
<pubDate>Tue, 17 Sep 2024 15:26:04 EDT</pubDate>
<guid isPermaLink="false">news645805562</guid>
<media:thumbnail url="https://scx1.b-cdn.net/csz/news/tmb/2024/8-things-to-know-about.jpg" width="90" height="90"/>
</item>
<item>
<title>Lord Kelvin: How the 19th century scientist combined research and innovation to change the world</title>
<description>"What got you into astrophysics?" It's a question I'm often asked at outreach events, and I answer by pointing to my early passion for exploring the biggest questions about our universe. Well, along with seeing Star Wars at an impressionable age.</description>
<link>https://phys.org/news/2024-09-lord-kelvin-19th-century-scientist.html</link>
<category>General Physics </category>
<pubDate>Tue, 17 Sep 2024 15:23:04 EDT</pubDate>
<guid isPermaLink="false">news645805382</guid>
<media:thumbnail url="https://scx1.b-cdn.net/csz/news/tmb/2024/kelvin.jpg" width="90" height="90"/>
</item>
Looks like the formatting is different.
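To make the difference concrete: the phys.org feed carries each image in a `media:thumbnail` element (Media RSS namespace), while the generated Atom feed puts the image URL in an ordinary `<link>`. A rough sketch of how a reader could pull the thumbnail out of the phys.org format, using only Python's standard library; the function name `thumbnails` is just for illustration, and the sample is trimmed from the feed above.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://search.yahoo.com/mrss/"

# Trimmed copy of the phys.org feed pasted above, closed so it parses.
SAMPLE_RSS = """<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<title>Phys.org - latest science and technology news stories</title>
<item>
<title>Europa Clipper: 8 things to know about NASA's mission to an ocean moon of Jupiter</title>
<link>https://phys.org/news/2024-09-europa-clipper-nasa-mission-ocean.html</link>
<media:thumbnail url="https://scx1.b-cdn.net/csz/news/tmb/2024/8-things-to-know-about.jpg" width="90" height="90"/>
</item>
</channel>
</rss>"""


def thumbnails(rss_xml):
    """Map each item's link to its media:thumbnail URL (None if absent)."""
    root = ET.fromstring(rss_xml)
    found = {}
    for item in root.iter("item"):
        thumb = item.find("{%s}thumbnail" % MEDIA_NS)
        found[item.findtext("link")] = (
            thumb.get("url") if thumb is not None else None
        )
    return found


for article, image in thumbnails(SAMPLE_RSS).items():
    print(article, "->", image)
```

Here the image is unambiguous because it has its own element, which is what the generated feed is missing.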
Hello, Diffbot,
The service helps a lot in providing RSS feeds for websites that don't have one, but I noted the following problems:
Blog images aren't captured (https://politepol.com/en/ does this well). Images give insight into an article at a glance. Example: https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/bg-p/blog-ai
Not everything is captured, and what is captured is not in chronological order: https://www.upwork.com/nx/search/jobs/?q=embedded
Sometimes garbage gets captured along with the feed, sometimes nothing gets captured at all, and sometimes titles aren't captured correctly: dates are used as titles and the titles end up as the text (https://www.quectel.com/blog/).