custom-components / feedparser

📰 RSS Feed Integration

Many warnings like "Unable to parse RFC-822 date from 2024-02-01T00:00:01Z." #127

Open wralb opened 7 months ago

wralb commented 7 months ago

Using Home Assistant 2024.03.0 in a Python 3.12 venv on Ubuntu LTS, running in a VM on XCP-ng on an Intel server.

Since upgrading from the Feedparser release version to beta 0.2.0.b7, I get many warnings like the one below for several feeds:

2024-03-13 08:49:37.703 WARNING (SyncWorker_13) [custom_components.feedparser.sensor] Feed Home Assistant: Unable to parse RFC-822 date from 2024-02-01T00:00:01Z. This could be caused by incorrect pubDate format in the RSS feed or due to a leapp second

Note the period character after "Z" in the date string in the warning message.

A good feed to test with is https://alerts.home-assistant.io/feed.xml. This feed has entries like below with valid W3CDTF (UTC timezone) datetimes (and no 'period character'):

<title>Ambiclimate integration will stop working April 1, 2024</title>
<link rel="alternate" href="https://alerts.home-assistant.io/alerts/ambiclimate/"/>
<updated>2024-02-01T00:00:01Z</updated>
<published>2024-02-01T00:00:01Z</published>
<id>https://alerts.home-assistant.io/alerts/ambiclimate/</id>
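To make the mismatch concrete, here is a minimal sketch that feeds the `<published>` value from the entry above to two standard-library parsers. It is only an illustration and assumes nothing about how the integration parses dates internally; `is_rfc822` is a hypothetical helper name.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

# Value copied from the <published> element of the feed entry above.
published = "2024-02-01T00:00:01Z"


def is_rfc822(value: str) -> bool:
    """Hypothetical helper: True if the stdlib RFC-822 parser accepts value."""
    try:
        parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, 3.10+ raise ValueError.
        return False
    return True


# The RFC-822 parser rejects the W3CDTF string.
rfc822_ok = is_rfc822(published)

# An explicit W3CDTF-style format parses it; %z accepts the literal "Z".
w3c = datetime.strptime(published, "%Y-%m-%dT%H:%M:%S%z")

print(rfc822_ok, w3c.isoformat())
```

So the date itself is well formed; it just is not in the RFC-822 shape that RSS 2.0 `pubDate` uses.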

An extract from the resulting sensor looks like:

- title: Ambiclimate integration will stop working April 1, 2024
  title_detail:
    type: text/plain
    language: null
    base: https://alerts.home-assistant.io
    value: Ambiclimate integration will stop working April 1, 2024
  links:
    - rel: alternate
      href: https://alerts.home-assistant.io/alerts/ambiclimate/
      type: text/html
  link: https://alerts.home-assistant.io/alerts/ambiclimate/
  updated: 2024-02-01T00:00:01Z.
  published: 2024-02-01T00:00:01Z.
  id: https://alerts.home-assistant.io/alerts/ambiclimate/
  guidislink: false

Note, again, the period character in the dates in the above sensor. Is this correct?

I've tried three things, none of which help:

  1. Adding this to config: date_format: '%Y-%m-%dT%H:%M:%SZ.' (period character in strftime definition)
  2. Adding this to config: date_format: '%Y-%m-%dT%H:%M:%SZ' (no period character in strftime definition)
  3. Removing date_format from config

ogajduse commented 7 months ago

@wralb Thank you for reporting it. TIL that the Atom Syndication Format exists. Since the Home Assistant Alerts feed is published in this format, the feedparser integration cannot assume that all feeds are RSS 2.0 compliant; it also needs to work with Atom feeds. I already have a working implementation of the time parsing, but I still need to test it and do a new release. I would like to get to it in a few days and will keep you posted.
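For readers following along, one possible shape for parsing that handles both feed types is sketched below. This is not the implementation mentioned above, which is not shown in this thread. `parse_feed_date` and `W3CDTF_FORMAT` are hypothetical names, and the sketch assumes only whole-second timestamps with a `Z` or numeric offset need to be covered.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

# Assumed Atom/W3CDTF shape: whole seconds plus "Z" or a numeric UTC offset.
W3CDTF_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_feed_date(value: str) -> datetime:
    """Hypothetical helper: try RFC-822 (RSS 2.0) first, then W3CDTF (Atom)."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Not an RFC-822 date; fall through to the Atom-style format.
        pass
    # Raises ValueError if the value matches neither format.
    return datetime.strptime(value, W3CDTF_FORMAT)


rss_date = parse_feed_date("Thu, 01 Feb 2024 00:00:01 GMT")
atom_date = parse_feed_date("2024-02-01T00:00:01Z")
print(rss_date == atom_date)
```

Trying RFC-822 first keeps existing RSS feeds on the same code path, and the fallback only runs for values the first parser rejects. Fractional seconds or date-only values would need additional formats.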