lemon24 / reader

A Python feed reader library.
https://reader.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Store entry source #276

Open SorenDalsgaard opened 2 years ago

SorenDalsgaard commented 2 years ago

Hi

First off, thanks for building this great tool.

I am having a small issue with understanding how to store elements from a feed item. For this feed https://www.newsdesk.lexisnexis.com/feed/aeb9dae6799dcbd4.rss I am interested in storing the source element.

How could I store this for each entry in the feed? Thanks

lemon24 commented 2 years ago

Hi, thanks for reaching out!

source doesn't seem to be stored at the moment; reader started out using a subset of the stuff feedparser exposes, and I kept adding more and more fields as I needed them.

I think source should be relatively easy to add. I can't promise I can get to it anytime soon, but I will prioritize reviewing a pull request for it.

lemon24 commented 2 years ago

Some notes on implementing this.

The RSS source just points to the original feed:

<source url="http://example.com/feed.xml">Some Feed</source>

The Atom source is way more complicated, containing a subset of the feed elements.

feedparser represents the RSS source element like this:

{'href': 'http://example.com/feed.xml', 'title': 'Some Feed'}

Note that href is not a documented attribute of feedparser's entries[i].source (but it is accessible as source.url, because source is a FeedParserDict). The Atom equivalent of href would then be link[rel=self], and the reader equivalent would be url (per #153, and for similarity with Feed).
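
To make the RSS/Atom difference concrete, here is a minimal sketch that pulls the source URL and title out of each format using only the standard library. It does not use feedparser; the function names and the sample documents are made up for illustration, and the Atom sample assumes the source carries a link with rel="self".

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Illustrative sample documents (not taken from a real feed).
RSS_ITEM = """\
<item>
  <title>Some Entry</title>
  <source url="http://example.com/feed.xml">Some Feed</source>
</item>
"""

ATOM_ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Some Entry</title>
  <source>
    <title>Some Feed</title>
    <link rel="self" href="http://example.com/feed.xml"/>
    <link rel="alternate" href="http://example.com/"/>
  </source>
</entry>
"""


def rss_source(item_xml):
    """Return (url, title) from an RSS <item>, or None if it has no <source>."""
    source = ET.fromstring(item_xml).find("source")
    if source is None:
        return None
    # RSS: the URL is an attribute, the title is the element text.
    title = (source.text or "").strip() or None
    return source.get("url"), title


def atom_source(entry_xml):
    """Return (url, title) from an Atom <entry>, or None if it has no <source>."""
    source = ET.fromstring(entry_xml).find("atom:source", ATOM_NS)
    if source is None:
        return None
    # Atom: the URL comes from the link whose rel is "self".
    url = None
    for link in source.findall("atom:link", ATOM_NS):
        if link.get("rel") == "self":
            url = link.get("href")
            break
    title = source.findtext("atom:title", default=None, namespaces=ATOM_NS)
    return url, title
```

Both helpers return the same (url, title) pair for the samples, which is the common ground a shared source representation would need.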

It should be pretty safe to make Entry.source have a subset of the Feed attributes:

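from __future__ import annotations  # lets Entry refer to EntrySource before it is defined

from datetime import datetime
from typing import Optional

class Entry:
    ...
    source: Optional[EntrySource]

class EntrySource:
    url: str  # RSS href/url; Atom link[rel=self]
    updated: Optional[datetime] = None
    title: Optional[str] = None
    link: Optional[str] = None
    author: Optional[str] = None
    subtitle: Optional[str] = None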
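
A minimal sketch of how a feedparser-style source mapping could be turned into the EntrySource proposed above. The helper name source_from_parsed is hypothetical, declaring the class as a frozen dataclass is an assumption, and the fallback to a links list with rel "self" for Atom is an assumption about the parsed shape. updated is left unset here to avoid guessing at date handling.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class EntrySource:
    url: str  # RSS href/url; Atom link[rel=self]
    updated: Optional[datetime] = None
    title: Optional[str] = None
    link: Optional[str] = None
    author: Optional[str] = None
    subtitle: Optional[str] = None


def source_from_parsed(source: Mapping[str, Any]) -> Optional[EntrySource]:
    """Hypothetical helper: build an EntrySource from a parsed source mapping.

    Returns None when no URL can be found, since url is the only
    required attribute in the proposal.
    """
    # RSS case: the URL is stored under 'href'.
    url = source.get("href")
    if not url:
        # Assumed Atom case: look for a link with rel="self".
        for link in source.get("links", ()):
            if link.get("rel") == "self" and link.get("href"):
                url = link["href"]
                break
    if not url:
        return None
    return EntrySource(
        url=url,
        title=source.get("title"),
        link=source.get("link"),
        author=source.get("author"),
        subtitle=source.get("subtitle"),
    )
```

With the RSS example from earlier, source_from_parsed({'href': 'http://example.com/feed.xml', 'title': 'Some Feed'}) gives an EntrySource with only url and title set.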

lemon24 commented 7 months ago

2024 implementation notes:

lemon24 commented 1 day ago

Additional implementation notes (in light of #290):