gitenberg-dev / giten_site

django repo for running the GITenberg website
http://www.gitenberg.org

xml parsing of external files #77

Open eshellman opened 6 years ago

eshellman commented 6 years ago

lxml is probably the thing to use here in place of Python's default XML parser.

First, install lxml.

Then, in external.py, replace from xml.etree import ElementTree with from lxml import etree as ElementTree and from lxml.etree import XMLParser, and parse with a recovering parser:

parser = XMLParser(recover=True)
ElementTree.fromstring(opds.content, parser=parser)
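For illustration, here is a minimal sketch of the difference between the strict standard-library parser and lxml's recovering parser. The feed content and the helper names (`BROKEN_FEED`, `strict_parse_fails`, `parse_with_recovery`) are made up for this example and are not taken from external.py; the unescaped ampersand simply stands in for whatever syntax error an external feed might contain.

```python
from xml.etree import ElementTree as StdElementTree

try:
    from lxml import etree  # third-party: pip install lxml
except ImportError:  # without lxml only the strict parser is available
    etree = None

# Hypothetical feed with an unescaped ampersand, which is not well-formed XML.
BROKEN_FEED = b"<feed><entry><title>Pride & Prejudice</title></entry></feed>"


def strict_parse_fails(content):
    """Return True if the standard-library parser rejects the content."""
    try:
        StdElementTree.fromstring(content)
    except StdElementTree.ParseError:
        return True
    return False


def parse_with_recovery(content):
    """Parse with lxml's recovering parser and return the root element."""
    if etree is None:
        raise RuntimeError("lxml is required for recovery parsing")
    # recover=True asks libxml2 to keep going past syntax errors
    # instead of raising on the first one.
    parser = etree.XMLParser(recover=True)
    return etree.fromstring(content, parser=parser)


if __name__ == "__main__":
    print("strict parser fails:", strict_parse_fails(BROKEN_FEED))
    if etree is not None:
        root = parse_with_recovery(BROKEN_FEED)
        print("recovered root tag:", root.tag)
```

Recovery is best-effort: the resulting tree may be missing whatever the parser had to skip, so it is probably worth checking the parsed result (and possibly `parser.error_log`) before trusting it.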

bdr99 commented 6 years ago

Sorry for the delay; I was away without internet access for a few days. I've created a PR, https://github.com/gitenberg-dev/giten_site/pull/78, to use lxml with the recover=True parser option. Hopefully this will help in the future if Standard Ebooks has more XML syntax errors.