Some feeds have .updated = None; in my feed database, 16% of the RSS and Atom feeds don't have it.
This is fine, since .updated is optional, but it is not useful: I would like to know when the feed last got a new or updated entry.
We can't use .last_updated, because it also changes when a check finds the feed not modified (it answers "when did reader last try to update this feed?").
Two options come to mind, both involving a new Feed attribute that contains max(max(e.updated, e.published) for e in entries); the difference is which entries are considered when calculating this:
All the entries the feed has stored. This is closer to the truth, but can include false alarms (e.g. an entry published in the future that was later removed from the feed, but that we still have in the database).
All the entries in the feed at parse time. This seems more useful, since it's more likely to reflect the intention of the feed author (obviously, it assumes the feed includes the most recent entries).
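A minimal sketch of the calculation, to make the difference between the two options concrete. The Entry class and the last_entry_update name are hypothetical stand-ins, not reader's actual API. The sketch also assumes either timestamp may be None, which a bare max() over the two would not handle.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class Entry:
    # Hypothetical stand-in for an entry; only the two timestamps matter here.
    updated: Optional[datetime] = None
    published: Optional[datetime] = None


def last_entry_update(entries: Iterable[Entry]) -> Optional[datetime]:
    """Return the newest updated/published timestamp, or None if there is none."""
    timestamps = [
        ts
        for e in entries
        for ts in (e.updated, e.published)
        if ts is not None  # skip missing timestamps instead of comparing None
    ]
    return max(timestamps, default=None)


# An entry dated in the future that was later removed from the feed.
removed = Entry(published=datetime(2099, 1, 1))
current = Entry(updated=datetime(2020, 1, 2), published=datetime(2020, 1, 1))

stored = [current, removed]  # option 1: everything in the database
parsed = [current]           # option 2: only what the feed has at parse time

print(last_entry_update(stored))  # the "false alarm" from the removed entry
print(last_entry_update(parsed))  # what the feed currently claims
```

With the first option the removed future-dated entry wins; with the second, the value follows whatever the feed currently contains.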