kurtmckee / feedparser

Parse feeds in Python
https://feedparser.readthedocs.io

Report RSS URL Direct Parsing Issue #408

Open bdim404 opened 1 year ago

bdim404 commented 1 year ago

I have encountered an issue when using the feedparser library to parse RSS directly from a URL. For example:

>>> feedparser.parse('https://hackernewsrss.com/feed.xml').keys()
dict_keys(['bozo', 'entries', 'feed', 'headers', 'bozo_exception'])
>>> d = feedparser.parse('https://hackernewsrss.com/feed.xml')
>>> d['feed']['title']

This results in the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/feedparser/util.py", line 113, in __getitem__
    return dict.__getitem__(self, key)
KeyError: 'title'

However, when I download the XML file to my local system and parse it using feedparser's local file reading method, it works correctly, as shown below:

>>> d = feedparser.parse(r'./a.xml')
>>> d['feed']['title']
'Hacker News: New Comments'
>>> d['feed']['links']
[{'rel': 'alternate', 'type': 'text/html', 'href': 'https://news.ycombinator.com/newcomments'}]

I believe this may be a bug, since parsing content directly from an RSS URL should work. I would appreciate it if this issue could be addressed. Thank you!

carltongibson commented 1 year ago

Works for me:

>>> import feedparser
>>> d = feedparser.parse('https://hackernewsrss.com/feed.xml')
>>> d.keys()
dict_keys(['bozo', 'entries', 'feed', 'headers', 'etag', 'href', 'status', 'encoding', 'version', 'namespaces'])
>>> d["feed"].keys()
dict_keys(['title', 'title_detail', 'subtitle', 'subtitle_detail', 'links', 'link', 'language', 'updated', 'updated_parsed', 'published', 'published_parsed', 'sy_updateperiod', 'image'])
>>> d["feed"]["title"]
'Hacker News RSS Feed'
>>>

bdim404 commented 8 months ago

Ok, I got it! Thanks for your reply!

alexscheelmeyer commented 8 months ago

I had a similar problem with 6.0.11. After downgrading to 6.0.3, the issue no longer occurs.
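
For anyone landing here with the same KeyError: the two key listings above differ. The failing session has 'bozo_exception' and lacks 'status' and 'href', while the working one has 'status' and no 'bozo_exception'. That may mean the request itself failed, although the thread never confirms the cause. Below is a minimal sketch for inspecting a result before indexing into it. The helper name describe_parse_result is hypothetical (not part of feedparser), and the sample dict is a stand-in that mirrors the failing key set; the real exception and the contents of 'feed' were not shown in the thread.

```python
def describe_parse_result(result):
    """Summarize a feedparser-style result without raising KeyError.

    Hypothetical helper, not part of feedparser. It only relies on
    dict-style .get(), which the object returned by feedparser.parse()
    supports as well.
    """
    feed = result.get("feed") or {}
    exc = result.get("bozo_exception")
    return {
        "bozo": bool(result.get("bozo")),           # truthy when parsing hit a problem
        "error": None if exc is None else repr(exc),
        "status": result.get("status"),             # missing in the failing session above
        "title": feed.get("title"),                 # None instead of KeyError
    }


# Stand-in data with the same top-level keys as the failing session.
# The exception and the empty 'feed' are assumptions for illustration.
failing = {
    "bozo": True,
    "bozo_exception": ValueError("stand-in"),
    "entries": [],
    "feed": {},
    "headers": {},
}

summary = describe_parse_result(failing)
print(summary)
```

With the real library, the same helper would be called as describe_parse_result(feedparser.parse(url)); printing the 'error' field should show what went wrong in the failing environment.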