Open GoogleCodeExporter opened 9 years ago
Actually the problem is caused by xml.sax not being available.
I encountered this bug in the Deluge torrent client, which includes a limited
version of Python 2.6, so _XML_AVAILABLE is set to 0 when it fails to import
xml.sax.
The problem is caused by this code:
# query variables in urls in link elements are improperly
# converted from `?a=1&b=2` to `?a=1&b;=2` as if they're
# unhandled character references. fix this special case.
output = re.sub("&([A-Za-z0-9_]+);", "&\g<1>", output)
When the URL looks like this:
http://somehost.net/torrents.php?action=download&amp;id=9631475
this code converts it into this:
http://somehost.net/torrents.php?action=download&ampid=9631475
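The substitution can be exercised on its own, outside the parser. This is a minimal sketch, assuming the URL reaches that line with its ampersand still escaped as `&amp;`; the helper name `strip_entity_semicolons` is made up for illustration and is not part of feedparser.

```python
import re

# The special-case substitution quoted above, written with raw strings.
ENTITY_FIX = re.compile(r"&([A-Za-z0-9_]+);")


def strip_entity_semicolons(output):
    """Drop the ';' that follows any '&name' run, as the quoted re.sub does."""
    return ENTITY_FIX.sub(r"&\g<1>", output)


# Intended case from the code comment: '&b;=2' is repaired to '&b=2'.
print(strip_entity_semicolons("?a=1&b;=2"))

# Assumption: the ampersand in the URL is still escaped as '&amp;' here.
escaped = "http://somehost.net/torrents.php?action=download&amp;id=9631475"
mangled = strip_entity_semicolons(escaped)
print(mangled)
```

The pattern cannot tell a real entity such as `&amp;` apart from the `&b;` debris it was written for, so both lose their semicolon.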
As I'm not entirely sure which cases this code is supposed to fix, I added a
bulletproof fix by placing this line directly above that re.sub:
output = re.sub("&amp;", "&", output)
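To check that the ordering matters, the two substitutions can be combined in a small standalone function. This is only a sketch under the assumption that the added line unescapes `&amp;` to `&` before the existing pattern runs; `fix_query_ampersands` is a hypothetical name, not a feedparser function.

```python
import re


def fix_query_ampersands(output):
    """Apply the proposed extra substitution, then the existing one."""
    # Step 1 (proposed line): turn escaped ampersands back into plain '&'.
    output = re.sub(r"&amp;", "&", output)
    # Step 2 (existing line): repair '&b;=2' style debris in query strings.
    output = re.sub(r"&([A-Za-z0-9_]+);", r"&\g<1>", output)
    return output


url = "http://somehost.net/torrents.php?action=download&amp;id=9631475"
print(fix_query_ampersands(url))
print(fix_query_ampersands("?a=1&b;=2"))
```

With the unescape step first, the second pattern no longer sees `&amp;`, so the query string keeps a plain `&` between its parameters.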
Reproduce the issue by raising ImportError here:
try:
    raise ImportError()
    import xml.sax
    from xml.sax.saxutils import escape as _xmlescape
except ImportError:
    _XML_AVAILABLE = 0
And try parsing the following rss file:
<rss version="2.0">
<channel>
<title>Site tile</title>
<link>Site url</link>
<description>Description of RSS Feed</description>
<language>en-us</language>
<ttl>120</ttl>
<item>
<title>Some title</title>
<link>http://hostname.com/Fetch?hash=2f21d4e59&amp;digest=865178f9bc</link>
<guid isPermaLink="true">http://hostname.com/Fetch?hash=2f21d4e59&amp;digest=865178f9bc</guid>
<comments>Some comment</comments>
<pubDate>Thu, 15 May 2012 00:16:18 +0000</pubDate>
<description>Detailed description</description>
<enclosure url="http://hostname.com/Fetch?hash=2f21d4e59&amp;id=865178f9bc" length="3423659" type="application/x-bittorrent"/>
</item>
</channel>
</rss>
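The ImportError from the reproduction steps can probably also be simulated without editing the library source. The sketch below assumes Python 3, where a `None` entry in `sys.modules` makes the corresponding import raise ImportError; `hide_module` and `probe_xml_available` are hypothetical helpers written for this comment, and the probe only mirrors the guarded import shown above.

```python
import contextlib
import sys


@contextlib.contextmanager
def hide_module(name):
    """Temporarily make 'import <name>' raise ImportError."""
    missing = object()
    saved = sys.modules.get(name, missing)
    sys.modules[name] = None  # a None entry makes the import fail
    try:
        yield
    finally:
        # Put sys.modules back the way it was found.
        if saved is missing:
            del sys.modules[name]
        else:
            sys.modules[name] = saved


def probe_xml_available():
    """Mirror the guarded import and return the resulting flag value."""
    try:
        import xml.sax
        from xml.sax.saxutils import escape as _xmlescape
    except ImportError:
        return 0
    return 1


with hide_module("xml.sax"):
    hidden = probe_xml_available()
restored = probe_xml_available()
print(hidden, restored)
```

If the guarded import runs when the parser module is first loaded, the `with` block would have to wrap that first import for the flag to end up as 0.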
Original comment by bendi...@gmail.com
on 17 May 2012 at 11:51
Original issue reported on code.google.com by
bendi...@gmail.com
on 17 May 2012 at 10:43