Closed GoogleCodeExporter closed 8 years ago
Does the epub validate with epubcheck? It shouldn't.
Bookworm is pretty lenient though, and this general case should already be
handled (see line 839).
It's probably just throwing a different exception than the one we're already
catching. If you can find out the exact exception that the entity inclusion is
throwing, add it to the list in 839.
Original comment by liza31337@gmail.com
on 27 Aug 2009 at 7:13
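One way to find the exact exception, as suggested above, is to parse a small document containing the entity and report the class of whatever is raised. This is only a sketch: it uses the standard library's xml.etree.ElementTree as a stand-in for the parser Bookworm actually calls, and `exception_name_for` is a hypothetical helper name, so the class reported here may differ from the one that needs adding on line 839.

```python
import xml.etree.ElementTree as ET


def exception_name_for(markup):
    """Return the qualified class name of the parse error, or None if parsing succeeds."""
    try:
        ET.fromstring(markup)
    except Exception as exc:  # deliberately broad: the goal is to learn the type
        cls = type(exc)
        return cls.__module__ + "." + cls.__qualname__
    return None


# Strict XML parsers do not know HTML entities such as &nbsp; unless a DTD declares them.
print(exception_name_for("<html><body>hello&nbsp;</body></html>"))
print(exception_name_for("<html><body>hello</body></html>"))
```

Running the same helper against the real parser call would give the class name to add to the except clause.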
> Does the epub validate with epubcheck? It shouldn't.
it probably doesn't - I removed the epubcheck check :o
> Bookworm is pretty lenient though, and this general case should already be
> handled (see line 839).
> It's probably just throwing a different exception than the one we're already
> catching. If you can find out the exact exception that the entity inclusion is
> throwing, add it to the list in 839.
Yes, it does get caught on 839, but BeautifulSoup bails when it hits &nbsp; as
well, although I didn't follow this through very far.
Will do some more testing and report back!
Original comment by steven.m...@gmail.com
on 27 Aug 2009 at 7:35
If we wanted a lot of entities:
http://www.oasis-open.org/docbook/specs/wd-docbook-xmlcharent-0.3.html
Original comment by abdela...@gmail.com
on 27 Aug 2009 at 1:35
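If a larger entity set were wanted, one possible approach is to declare the entities in an internal DTD subset so that a strict XML parser expands them itself. The sketch below makes several assumptions: it takes names from Python's html.entities table (the HTML set, not the DocBook set linked above), it parses with the standard library's xml.etree.ElementTree rather than lxml, and `with_entity_declarations` is a hypothetical helper name.

```python
from html.entities import name2codepoint
import xml.etree.ElementTree as ET

# The five entities XML predefines must not be redeclared this way.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def with_entity_declarations(body, root="html", names=("nbsp", "copy", "mdash")):
    """Prefix body with a DOCTYPE whose internal subset declares the given entities."""
    decls = "".join(
        '<!ENTITY %s "&#%d;">' % (name, name2codepoint[name])
        for name in names
        if name not in XML_PREDEFINED
    )
    return "<!DOCTYPE %s [%s]>%s" % (root, decls, body)


doc = with_entity_declarations("<html><body>hello&nbsp;world</body></html>")
tree = ET.fromstring(doc)
print(repr(tree.find("body").text))
```

Declaring only the entities a book actually uses keeps the prologue small; passing a longer `names` tuple covers more of the table.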
BeautifulSoup should _absolutely_ be able to handle a real nbsp. Are you sure
it's not typoed?
>>> import lxml.html.soupparser as parser
>>> x = parser.fromstring('<html><body>hello&nbsp;</body></html>')
>>> import lxml.etree as etree
>>> etree.tostring(x)
'<html><body>hello&#160;</body></html>'
Original comment by liza31337@gmail.com
on 27 Aug 2009 at 3:56
I think I must have got something else wrong on this, because running my tests
against the latest trunk shows no problems with &nbsp;: lxml.etree.XML fails,
but BeautifulSoup deals with it just fine.
I think this can be closed - nothing to see, move along here :)
Original comment by steven.m...@gmail.com
on 27 Aug 2009 at 11:38
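The strict-then-lenient behaviour described above can be sketched with the standard library alone. This is not Bookworm's code: xml.etree.ElementTree stands in for lxml.etree.XML, html.parser.HTMLParser stands in for BeautifulSoup, and `TextCollector` and `extract_text` are hypothetical names.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextCollector(HTMLParser):
    """Lenient parser that gathers text and resolves HTML entities such as &nbsp;."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(markup):
    """Try a strict XML parse first; fall back to the lenient HTML parser on failure."""
    try:
        return "".join(ET.fromstring(markup).itertext())
    except ET.ParseError:
        collector = TextCollector()
        collector.feed(markup)
        collector.close()
        return "".join(collector.chunks)


print(repr(extract_text("<html><body>hello&nbsp;</body></html>")))
```

Well-formed input never reaches the fallback, so the lenient path only costs anything for documents the strict parser rejects.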
Original issue reported on code.google.com by
steven.m...@gmail.com
on 27 Aug 2009 at 6:59