When parsing the latest XKCD feed, I noticed that the alt text of the latest
entry is mangled. It seems to be a bug in the handling of entities in title/alt
text, since the truncated string ends where there's a " character in the text.
What steps will reproduce the problem?
1. Parse an article with links embedded in the content
2. View the parsed content.
What is the expected output? What do you see instead?
Unmangled title/alt text.
What version of the product are you using? On what operating system?
Feedparser 4.1 on Gentoo
Please provide any additional information below.
I've attached a copy of the misbehaving XML; parsing it will demonstrate
the problem in what turns into ["entries"][0]. It should say "They
could say "the connection is probably lost," but it's
more fun to do naive time-averaging to give you hope that if you wait
around for 1,163 hours, it will finally finish." Instead, it only says
"They could say".
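
For comparison, here is a minimal sketch of the behaviour I would expect, using only Python's standard-library HTMLParser rather than feedparser itself. The FRAGMENT string is a hypothetical stand-in modelled on the text quoted above (the real markup in the attached XML may differ), and TitleCollector and img_titles are names made up for this illustration. It only shows that a title attribute containing &quot; can be decoded without being cut off at the first quote; it does not claim to identify where feedparser goes wrong.

```python
from html.parser import HTMLParser

# Hypothetical fragment modelled on the report; the real feed markup may differ.
FRAGMENT = (
    '<img src="comic.png" '
    'title="They could say &quot;the connection is probably lost,&quot; '
    'but it\'s more fun" alt="comic" />'
)


class TitleCollector(HTMLParser):
    """Collect the title attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            # attrs is a list of (name, value) pairs; HTMLParser has already
            # decoded entities such as &quot; inside the attribute values.
            self.titles.extend(value for name, value in attrs if name == "title")


def img_titles(markup):
    """Return the decoded title attributes of all <img> tags in markup."""
    parser = TitleCollector()
    parser.feed(markup)
    parser.close()
    return parser.titles


print(img_titles(FRAGMENT)[0])
# prints: They could say "the connection is probably lost," but it's more fun
```

With this stand-in, the decoded title keeps everything after the embedded quotes, which is what I would expect to see in ["entries"][0] as well.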
Original issue reported on code.google.com by TheMo...@gmail.com on 20 Jul 2009 at 6:48