HaveF / feedparser

Automatically exported from code.google.com/p/feedparser

importing lxml.etree changes what exceptions libxml2 throws #352

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Use Flexget 1.0r2672 (and others) with libxml2 installed

What is the expected output? What do you see instead?
No error is expected. Instead I get the error message: "SAXException: Read error
(no details available)"

What version of the product are you using? On what operating system?
Arch Linux.

Please provide any additional information below.

An error report at Flexget with details and error messages is available here: 
http://flexget.com/ticket/1446

I've had issues which seem to originate with feedparser when I have libxml2 
installed. The application I'm using which requires feedparser is Flexget. 
Another application I use requires libxml2, so I can't simply uninstall it.

I'm currently able to reproduce the error inside Flexget, but not while
using feedparser in the shell directly. What happens in Flexget is very
strange.

I have several RSS feeds that are parsed looking at TV shows, and they
don't cause any errors. Then, when parsing one of the following 3 feeds
for movies, the SAXException occurs. However, if I run Flexget on just
these movies, it works fine. That is, the error occurs only after
parsing the TV feeds THEN the movies feeds.

TV feeds:
http://www.torlock.com/television/rss.xml
http://torrentz.eu/feed_verified?q=tv
http://ezrss.it/feed/
http://showrss.karmorra.info/feeds/all.rss
http://rss.bt-chat.com/?group=3
http://rss.thepiratebay.org/208

Movies feeds:
http://www.torlock.com/movies/rss.xml
https://torrentz.eu/feed_verified?q=movies
https://rss.thepiratebay.org/207

Flexget uses the requests library to fetch the data before passing it to
feedparser, so the cause is possibly some combination of those two and
drv_libxml2.

Original issue reported on code.google.com by BluePhoe...@gmail.com on 7 May 2012 at 3:20

GoogleCodeExporter commented 9 years ago
Thanks for opening this ticket! I noticed that one of the posters on the 
Flexget issue tracker, lazybones, was using a 64-bit platform. Are you running 
32-bit or 64-bit? And would you provide me with the exact versions of the 
software that might be involved with this? I'm looking specifically for the 
versions for:

Python
Flexget
feedparser
lxml
libxml2
requests
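
One way to gather most of these in one go is sketched below. It is only a rough helper, not part of feedparser or Flexget: `collect_versions` is a hypothetical name, it assumes Python 3.8+ for `importlib.metadata`, and it assumes the packages are installed under the distribution names shown. The libxml2 version is read through lxml, so it is reported as unknown when lxml cannot be imported.

```python
import platform
from importlib import metadata


def collect_versions(packages=("FlexGet", "feedparser", "lxml", "requests")):
    """Return a dict mapping each component to its installed version string."""
    versions = {
        "Python": platform.python_version(),
        # Reports the interpreter's word size, e.g. "32bit" or "64bit".
        "architecture": platform.architecture()[0],
    }
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    try:
        # lxml exposes the libxml2 version it runs against as a tuple.
        from lxml import etree
        versions["libxml2"] = ".".join(str(part) for part in etree.LIBXML_VERSION)
    except ImportError:
        versions["libxml2"] = "unknown (lxml not importable)"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Note that the standalone libxml2 Python bindings (the ones behind drv_libxml2) are packaged separately from lxml, so their version may still need to be checked with the system package manager.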

When I have that information I'll try installing everything. And for my future 
reference, here are the URLs of the TV feeds and Movies:

http://www.torlock.com/television/rss.xml
http://torrentz.eu/feed_verified?q=tv
http://ezrss.it/feed/
http://showrss.karmorra.info/feeds/all.rss
http://rss.bt-chat.com/?group=3
http://rss.thepiratebay.org/208

http://www.torlock.com/movies/rss.xml
https://torrentz.eu/feed_verified?q=movies
https://rss.thepiratebay.org/207

Original comment by kurtmckee on 7 May 2012 at 2:34

GoogleCodeExporter commented 9 years ago
I've identified that the problem is with the lxml project, but this problem can 
be dealt with by feedparser. I've reported the bug to the lxml project, and 
will create a unit test and a fix soon.

https://bugs.launchpad.net/lxml/+bug/1001301

Original comment by kurtmckee on 18 May 2012 at 3:37

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r710.

Original comment by kurtmckee on 23 May 2012 at 4:39

GoogleCodeExporter commented 9 years ago
I've fixed the problem at the feedparser level by catching SAXException (which 
is the parent class of SAXParseException). Ill-formed feeds will then be parsed 
with the SGML parser. I'll also update the FlexGet ticket so the devs can 
decide how to handle it.
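
For illustration only, the pattern can be sketched with the standard library's `xml.sax`. This is not feedparser's actual code: `parse_strict` is a hypothetical helper, and the fallback to a looser parser is represented only by the returned flag. The point is that an `except` clause for the parent class handles both a plain SAXException (such as the "Read error" seen here) and the more specific SAXParseException.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


def parse_strict(data, handler=None):
    """Attempt a strict SAX parse of `data` (bytes).

    Returns (ok, error). When ok is False, the caller would switch to a
    more forgiving parser instead of letting the exception escape.
    """
    handler = handler or ContentHandler()
    try:
        xml.sax.parse(io.BytesIO(data), handler)
    except xml.sax.SAXException as exc:
        # SAXParseException subclasses SAXException, so catching the parent
        # covers ordinary well-formedness errors as well as bare
        # SAXException instances raised elsewhere in the SAX machinery.
        return False, exc
    return True, None


if __name__ == "__main__":
    ok, error = parse_strict(b"<rss><channel>")  # unclosed elements
    if not ok:
        print("strict parse failed, falling back:", error)
```

Catching only SAXParseException would let a bare SAXException propagate to the caller, which matches the behaviour reported in this ticket.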

Thanks for reporting this!

Original comment by kurtmckee on 23 May 2012 at 4:45

GoogleCodeExporter commented 9 years ago
Any chance for a release with this fix included?

Original comment by chase.sterling@gmail.com on 16 Oct 2012 at 3:01

GoogleCodeExporter commented 9 years ago
I am seeing the same exception even after I uninstalled lxml. The cause may be 
the same as in issue:211. Please make a new release with the fix you mentioned 
5 months ago.

Original comment by b...@atmaildot.com on 2 Nov 2012 at 6:04