Open GoogleCodeExporter opened 9 years ago
I downloaded the stock 4.1 release and tried the code you listed above.
'dc_value' is a string, not a dict, so while 4.1 gives you the content, you get
none of the element's attributes. I then tried the same thing with svn trunk
and found that the element attributes are available (exactly as you noted
above) but the element content isn't.
feedparser 4.1 was released five years ago, and the namespace code has changed
since then. The documentation has to be updated, no question, but I'll review
the current behavior and the old documentation and see if there's an obvious
solution. In the meantime, you may be able to work around the problem by forcing
feedparser's behavior to your liking using code similar to:
import feedparser

# this will override unknown element behavior
# include the 'self' parameter
def start_dc_value(self, attrsD):
    self.pushContent('dc_value', attrsD, 'text/plain', 1)

def end_dc_value(self):
    value = self.popContent('dc_value')
    context = self._getContext()
    context['dc_value'] = value

# insert the new functions to override current behavior
feedparser._FeedParserMixin._start_dc_value = start_dc_value
feedparser._FeedParserMixin._end_dc_value = end_dc_value

f = feedparser.parse('http://xurrency.com/gbp/feed')
e = f.entries[0]
print(e['dc_value'])  # prints a string like '1.1869'
Original comment by kurtmckee
on 20 Feb 2011 at 11:32
I have the same issue when parsing an equivalent RSS feed. I have something like
<tag attr1="foo" attr2="bar">baz</tag>
And I want to be able to read the attrs and the content (baz).
The fix you provided did not seem to change anything. I did a json dump of the
return value of feedparser.parse both before and after overriding the behavior
and diffed them, and there was no diff.
(I'm using 5.0.1)
Original comment by danj...@gmail.com
on 5 Sep 2011 at 6:41
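For reference, the element shape described in this comment (attributes plus text content on the same tag) can be read with the standard library alone. This is only a stopgap sketch outside feedparser, not a fix for feedparser itself; the sample element is the one quoted above.

```python
import xml.etree.ElementTree as ET

# The sample element quoted in the comment above.
elem = ET.fromstring('<tag attr1="foo" attr2="bar">baz</tag>')

attrs = dict(elem.attrib)  # the element's attributes as a plain dict
text = elem.text           # the element's text content

print(attrs)
print(text)
```

A namespaced element would need its namespace declared in the parsed document for ElementTree to accept it.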
Oh, so I was in blind copy-and-paste mode and didn't realize that that was
actually particular to the name of his tag. Now that I did
s/dc_value/my_tag_name/ it worked. But I have several tags like this I want to
fix. Is there no general solution?
Original comment by danj...@gmail.com
on 5 Sep 2011 at 6:47
I haven't determined the best way to handle this yet, so there isn't yet a
convenient general solution.
Original comment by kurtmckee
on 6 Sep 2011 at 3:14
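Until there is a general solution, the per-tag workaround above could be generated for several tags in a loop. The following is a minimal sketch, assuming the same private hooks used in the workaround (pushContent, popContent, _getContext and the _start_/_end_ naming); these are internal and may differ in other feedparser versions. The helper names make_handlers and install_handlers, the _attr_stack attribute, and the {'value': ..., 'attrs': ...} layout are hypothetical choices made for illustration, and whether the parser tolerates a dict stored this way has not been checked against every release.

```python
def make_handlers(key):
    """Build start/end handlers for one element key such as 'dc_value'."""

    def start(self, attrsD):
        # Remember the attributes so the end handler can store them
        # alongside the text (hypothetical helper attribute).
        stack = getattr(self, '_attr_stack', None)
        if stack is None:
            stack = self._attr_stack = []
        stack.append(dict(attrsD))
        self.pushContent(key, attrsD, 'text/plain', 1)

    def end(self):
        value = self.popContent(key)
        attrs = self._attr_stack.pop()
        context = self._getContext()
        # Assumption: keep both the text and the attributes under one key.
        context[key] = {'value': value, 'attrs': attrs}

    return start, end


def install_handlers(mixin_cls, keys):
    """Attach generated handlers to mixin_cls for every key in keys."""
    for key in keys:
        start, end = make_handlers(key)
        setattr(mixin_cls, '_start_' + key, start)
        setattr(mixin_cls, '_end_' + key, end)
```

With the feedparser version discussed in this thread, this would presumably be applied as install_handlers(feedparser._FeedParserMixin, ['dc_value', 'my_tag_name']) before calling feedparser.parse(...), where my_tag_name is a placeholder for your own tag.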
Issue 301 has been merged into this issue.
Original comment by kurtmckee
on 9 Sep 2011 at 2:06
Original comment by kurtmckee
on 9 Sep 2011 at 2:07
I have the same issue.
http://itunes.apple.com/us/rss/topfreeapplications/limit=10/xml
A line like this
<im:artist href="http://itunes.apple.com/us/artist/fluik/id341885018?mt=8&uo=2">Fluik</im:artist>
is broken: I can't get "Fluik" from this XML element.
Original comment by electron...@gmail.com
on 26 Sep 2011 at 6:53
Issue 341 has been merged into this issue.
Original comment by kurtmckee
on 9 Apr 2012 at 4:27
I'm having issues with this as well.
Would like a fix for this.
Original comment by kwh...@gmail.com
on 12 Dec 2012 at 8:02
Issue 420 has been merged into this issue.
Original comment by kurtmckee
on 10 Jul 2014 at 4:42
Original issue reported on code.google.com by
kylemacfarlane@gmail.com
on 20 Feb 2011 at 10:21