HaveF / feedparser

Automatically exported from code.google.com/p/feedparser

Hex entities with an uppercase 'x' cause a crash. #324

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Parse a feed whose title contains a hex character reference written with an uppercase 'X', e.g. <title>Forget Me&#X2026; Not</title>

What is the expected output? What do you see instead?

Should result in a … (U+2026), but the parser dies because it only checks for a
lowercase 'x'. I'm not certain whether that's legal XML, but it certainly exists
out there.
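
For reference, the expected result can be illustrated with Python 3's standard library alone. This is only a sketch of the desired output: it uses `html.unescape`, not feedparser, and the variable names are made up for the example.

```python
import html

# The title text from the reproduction step, with an uppercase 'X'
# in the hexadecimal character reference.
title = "Forget Me&#X2026; Not"

# html.unescape accepts both '&#x...;' and '&#X...;' forms.
expected = html.unescape(title)
print(expected)
```

The printed title should contain the horizontal ellipsis character (U+2026) in place of the reference.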

Please use labels and text to provide additional information.

Index: feedparser.py
===================================================================
--- feedparser.py   (revision 23507)
+++ feedparser.py   (working copy)
@@ -684,7 +684,7 @@
         if ref in ('34', '38', '39', '60', '62', 'x22', 'x26', 'x27', 'x3c', 'x3e'):
             text = '&#%s;' % ref
         else:
-            if ref[0] == 'x':
+            if ref[0].lower() == 'x': # kee, add lower
                 c = int(ref[1:], 16)
             else:
                 c = int(ref)
@@ -1901,7 +1901,7 @@
     def handle_charref(self, ref):
         # called for each character reference, e.g. for '&#160;', ref will be '160'
         # Reconstruct the original character reference.
-        if ref.startswith('x'):
+        if ref.lower().startswith('x'): # kee, add lower
             value = unichr(int(ref[1:],16))
         else:
             value = unichr(int(ref))
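
The idea behind both hunks can be sketched as a small standalone function. This is a minimal illustration, not feedparser's actual code: `decode_charref` is a hypothetical name, and it is written for Python 3 (`chr`) whereas the patch above targets Python 2 (`unichr`).

```python
def decode_charref(ref):
    """Decode the body of a numeric character reference.

    `ref` is the text between '&#' and ';', e.g. '160', 'x2026' or 'X2026'.
    Hypothetical helper for illustration only.
    """
    # Compare the marker case-insensitively so 'x' and 'X' both
    # select hexadecimal parsing. Slicing avoids an IndexError on ''.
    if ref[:1].lower() == 'x':
        return chr(int(ref[1:], 16))
    # Otherwise the reference is decimal.
    return chr(int(ref))


# Both spellings of the hex marker decode to the same character.
assert decode_charref('X2026') == decode_charref('x2026')
```

Only the marker needs case folding: `int(..., 16)` already accepts hex digits in either case.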

Original issue reported on code.google.com by keehinck...@gmail.com on 2 Feb 2012 at 3:45

GoogleCodeExporter commented 9 years ago
Ignore the revision #, that's from our internal source tree.

Original comment by keehinck...@gmail.com on 2 Feb 2012 at 3:45

GoogleCodeExporter commented 9 years ago
Could you check which version of the software you're using? I committed a patch
that should have resolved this back in 2010. If you're using version 5.1, would
you please attach a feed that demonstrates the problem?

Original comment by kurtmckee on 4 Feb 2012 at 8:12

GoogleCodeExporter commented 9 years ago

Original comment by kurtmckee on 11 Feb 2012 at 6:40

GoogleCodeExporter commented 9 years ago
I'm hoping to release a new version of feedparser this weekend. If you reply
confirming that you're seeing this behavior in the current release, and you have
a feed that demonstrates the problem, I'd love to incorporate a fix for it!

Original comment by kurtmckee on 25 Feb 2012 at 2:54

GoogleCodeExporter commented 9 years ago
If you find or create a feed that demonstrates this issue, please leave a
comment on this report and attach the file.

Original comment by kurtmckee on 7 Apr 2012 at 6:51