xqwzts / feedparser

A Dart library for parsing RSS feeds
MIT License

Parsing Robustness on YouTube and Blogger RSS feeds #1

Open daftspaniel opened 6 years ago

daftspaniel commented 6 years ago

The package is working well for 'valid' XML feeds.

However, I'm having issues with feeds from some popular RSS providers:
http://returnofthebeast.blogspot.com/feeds/posts/default?alt=rss
https://www.youtube.com/feeds/videos.xml?channel_id=UCspFbd1b1vwuhsj0ZVehd2w

Going by the stack trace, the issue seems to lie more with the underlying XML package than with feedparser:

SEVERE 2018-02-02 20:49:59.507736 Invalid argument(s): </ expected at 11:1
SEVERE 2018-02-02 20:49:59.508591 #0 parse (package:xml/xml.dart:42:5)
#1 parse (package:feedparser/src/parser.dart:20:32)
#2 RssWatcher.parseFeed (package:dplanet/rss/rsswatcher.dart:101:20)
#3 RssWatcher.checkForUpdate (package:dplanet/rss/rsswatcher.dart:74:25)
#4 RssWatcher.checkAll (package:dplanet/rss/rsswatcher.dart:54:7)
#5 main (file:///home/daftspaniel/development/WebstormProjects/dplanet/bin/main.dart:12:14)
#6 _startIsolate. (dart:isolate-patch/isolate_patch.dart:263)
#7 _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:151)

Thanks!

xqwzts commented 6 years ago

Hi @daftspaniel, thanks for reporting these.

I wasn't able to reproduce an error parsing the Blogspot feed. How are you passing the feed string to feedparser?

The YouTube feed fails for me as well. feedparser is mostly limited to RSS 2.0 feeds for the moment, only because those were what I was parsing in the project I originally built feedparser for. YouTube seems to implement its own schema (most importantly, it skips the channel tag, which is the only tag feedparser treats as mandatory).

I'll look into what it would take to add YouTube support. PRs are welcome, of course!
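
In the meantime, a caller could check the shape of a feed before handing it to feedparser, so that a missing channel tag or malformed XML is reported up front rather than surfacing as a stack trace. The following is only a rough sketch of that idea, not how feedparser works internally. `FeedKind` and `detectFeedKind` are hypothetical names, and it assumes a recent release of package:xml that provides `XmlDocument.parse` (the trace above comes from an older release with a top-level `parse` function). It also assumes that a feed without a channel tag, such as the YouTube one, has an Atom-style `feed` root element.

```dart
import 'package:xml/xml.dart';

/// Feed flavours this sketch distinguishes (hypothetical names).
enum FeedKind { rss2, atom, unknown }

/// Inspects the root element of [body] and guesses the feed flavour.
/// Returns [FeedKind.unknown] for malformed XML or an unrecognised root.
FeedKind detectFeedKind(String body) {
  try {
    final document = XmlDocument.parse(body);
    // Skip the XML declaration, comments and whitespace before the root.
    final roots = document.children.whereType<XmlElement>();
    if (roots.isEmpty) {
      return FeedKind.unknown;
    }
    final root = roots.first;
    // RSS 2.0: an <rss> root with at least one <channel> child.
    if (root.name.local == 'rss' && root.findElements('channel').isNotEmpty) {
      return FeedKind.rss2;
    }
    // Atom-style feeds use a <feed> root and have no <channel> element.
    if (root.name.local == 'feed') {
      return FeedKind.atom;
    }
    return FeedKind.unknown;
  } on XmlException {
    // Malformed input, for example an unclosed tag.
    return FeedKind.unknown;
  }
}

void main() {
  const rss = '<rss version="2.0"><channel><title>t</title></channel></rss>';
  const atom =
      '<feed xmlns="http://www.w3.org/2005/Atom"><title>t</title></feed>';
  const broken = '<rss><channel>';

  print(detectFeedKind(rss));
  print(detectFeedKind(atom));
  print(detectFeedKind(broken));
}
```

With a check like this, only feeds reported as `FeedKind.rss2` would be passed on to feedparser, and anything else could be logged or skipped.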