pombreda / feedparser

Automatically exported from code.google.com/p/feedparser

Problem with parsing Media RSS of Yahoo #188

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1.
    >>> news_rss_url = "http://rss.ent.yahoo.com/movies/thisweek.xml"
    >>> import feedparser
    >>> info = feedparser.parse(news_rss_url)
    >>> info.entries[0].media_thumbnail
    u''

What is the expected output? What do you see instead?
A dictionary for this node:

    <media:thumbnail height="74"
        url="http://l.yimg.com/eb/ymv/us/img/hv/photo/movie_pix/ifc_films/afterschool/afterschool_smallposter-th.jpg"
        width="50"/>

Something like this:

    {
        'node_name': 'thumbnail',
        'height': 74,
        'url': "http://l.yimg.com/eb/ymv/us/img/hv/photo/movie_pix/ifc_films/afterschool/afterschool_smallposter-th.jpg",
        'width': 50,
    }
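
For illustration only, here is a minimal sketch of the kind of result described above, built with the standard library's `xml.etree.ElementTree` rather than feedparser. The helper name `thumbnail_as_dict`, the `node_name` key, and the shortened `example.com` URL are assumptions made for this sketch; they are not part of feedparser's API. Note that attribute values come back as strings, not integers.

```python
import xml.etree.ElementTree as ET

# Media RSS namespace URI, as declared in the sample document below.
MEDIA_NS = "http://search.yahoo.com/mrss/"

# Hypothetical sample item; the URL is a shortened placeholder.
SAMPLE = """<item xmlns:media="http://search.yahoo.com/mrss/">
  <media:thumbnail height="74" url="http://example.com/poster-th.jpg" width="50"/>
</item>"""


def thumbnail_as_dict(item_xml):
    """Return the attributes of the first media:thumbnail as a dict, or None."""
    item = ET.fromstring(item_xml)
    node = item.find("{%s}thumbnail" % MEDIA_NS)
    if node is None:
        return None
    result = {"node_name": "thumbnail"}
    # Copy every attribute (height, url, width) as-is; values stay strings.
    result.update(node.attrib)
    return result


thumb = thumbnail_as_dict(SAMPLE)
print(thumb)
```

Converting `height` and `width` to integers would be a separate step, since XML attributes carry no type information.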

What version of the product are you using? On what operating system?
feedparser 4.1
Windows vista

Please provide any additional information below.

Basically, I am raising two points here:
1. I have had problems reading the attributes of elements in RSS namespaces.
2. Currently, if an entry has multiple nodes with the same name, as follows:

      <media:category scheme="111">AAA</media:category>
      <media:category scheme="222">BBB</media:category>
      <media:category scheme="333">CCC</media:category>

then

    >>> info.entries[0].media_category

outputs only the last value:

    u'CCC'

it should returned disctionary
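
To make the second point concrete, here is a rough sketch of collecting every repeated `media:category` node together with its `scheme` attribute, instead of keeping only the last one. It uses the standard library's `xml.etree.ElementTree`, not feedparser; the helper name `collect_categories` and the `scheme`/`value` keys are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

# Media RSS namespace URI, as declared in the sample document below.
MEDIA_NS = "http://search.yahoo.com/mrss/"

SAMPLE = """<item xmlns:media="http://search.yahoo.com/mrss/">
  <media:category scheme="111">AAA</media:category>
  <media:category scheme="222">BBB</media:category>
  <media:category scheme="333">CCC</media:category>
</item>"""


def collect_categories(item_xml):
    """Return one dict per media:category node, in document order."""
    item = ET.fromstring(item_xml)
    return [
        {"scheme": node.get("scheme"), "value": node.text}
        for node in item.findall("{%s}category" % MEDIA_NS)
    ]


categories = collect_categories(SAMPLE)
for category in categories:
    print(category)
```

A list of dictionaries keeps all three values and their attributes, whereas a single string can hold only one of them.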

Original issue reported on code.google.com by prashantchaudharry@gmail.com on 1 Oct 2009 at 3:23

GoogleCodeExporter commented 9 years ago
Sorry for the typo:
> it should returned disctionary
>* it should have returned a dictionary..

Fixing this issue would solve these parsing problems.

Thanks,
Prashant C

Original comment by prashantchaudharry@gmail.com on 1 Oct 2009 at 3:27

GoogleCodeExporter commented 9 years ago
>>> import feedparser
>>> news_rss_url = "http://rss.ent.yahoo.com/movies/thisweek.xml"
>>> f = feedparser.parse(news_rss_url)
>>> f.entries[0].media_thumbnail
[{'url': u'http://l.yimg.com/eb/ymv/us/img/hv/photo/movie_pix/sony_pictures_classics/the_white_ribbon/thewhiteribbon_smallposter-th.jpg', 'width': u'50', 'height': u'74'}]

The above code does what you want with the current version of the codebase.
We now preserve attributes in namespaced elements.

Please file a separate bug about the lack of support for repeated namespaced elements.

Original comment by adewale on 27 Dec 2009 at 11:57

GoogleCodeExporter commented 9 years ago
Thanks :)

Original comment by prashantchaudharry@gmail.com on 28 Dec 2009 at 9:42