libo26 / feedparser

Automatically exported from code.google.com/p/feedparser

Truncated link attributes #178

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
When parsing the latest XKCD feed, I noticed that the alt text of the latest
entry is mangled. It seems to be a bug in the handling of entities in title/alt
text, since the truncated string ends where there's a " in the text.

What steps will reproduce the problem?
1. Parse an article with links embedded in the content
2. View the parsed content.

What is the expected output? What do you see instead?

Unmangled title/alt text.

What version of the product are you using? On what operating system?

Feedparser 4.1 on Gentoo

Please provide any additional information below.

I've attached a copy of the misbehaving XML; parsing it demonstrates the
problem in what becomes ["entries"][0]. Where it should say "They
could say "the connection is probably lost," but it's
more fun to do naive time-averaging to give you hope that if you wait
around for 1,163 hours, it will finally finish.", it only says "They could say".
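
The symptom above would be consistent with entities being decoded before the attribute value is extracted, although that is a guess and not a confirmed diagnosis of feedparser 4.1. Here is a minimal sketch of that mechanism using only the standard library. The regex, the two helper functions, and the sample markup (including `comic.png`) are hypothetical stand-ins, not feedparser code and not the attached XML.

```python
import html
import re

# Hypothetical, simplified attribute matcher: the value ends at the first '"'.
TITLE_RE = re.compile(r'title="([^"]*)"')


def title_decode_first(markup):
    """Decode entities, then extract the attribute (the order that truncates)."""
    match = TITLE_RE.search(html.unescape(markup))
    return match.group(1) if match else None


def title_extract_first(markup):
    """Extract the raw attribute, then decode entities (the order that preserves)."""
    match = TITLE_RE.search(markup)
    return html.unescape(match.group(1)) if match else None


# Made-up markup resembling the entry content; not the attached XML.
markup = ('<img src="comic.png" title="They could say &quot;the connection '
          'is probably lost,&quot; but it&#39;s more fun" />')

print(repr(title_decode_first(markup)))   # cut off at the first decoded quote
print(repr(title_extract_first(markup)))  # full text with the quotes intact
```

If the real cause is different, the same pair of calls still makes a handy check: the first reproduces the truncated "They could say" output, the second shows the expected text.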

Original issue reported on code.google.com by TheMo...@gmail.com on 20 Jul 2009 at 6:48

GoogleCodeExporter commented 9 years ago

Original comment by adewale on 27 Dec 2009 at 11:59