jacobthill closed this issue 4 years ago
Cannot reproduce this because the OAI interface is restricted. I suspect that the interface returns an empty response, something like:
</>
Sickle uses an XML parser that forgives some flaws in the XML structure. With that parser, this response will cause the parsed result to be None:
>>> from lxml import etree
>>> XMLParser = etree.XMLParser(remove_blank_text=True, recover=True, resolve_entities=False)
>>> type(etree.XML('</>', parser=XMLParser))
<class 'NoneType'>
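If the empty-response theory is right, one possible workaround is to wrap the record iterator so that a failing call to next() is skipped rather than ending the harvest. The following is only a sketch: `iter_tolerant` and `FlakySource` are hypothetical names, not part of Sickle, and `FlakySource` merely stands in for a harvest iterator that raises on a bad response. Whether Sickle's own iterator can continue after such an error has not been verified here. If the failure happens while fetching the next page, retrying will likely hit the same response again, which is why the sketch gives up after a few consecutive failures.

```python
def iter_tolerant(items, max_consecutive_errors=3, on_error=None):
    """Yield from `items`, skipping calls to next() that raise.

    Stops after `max_consecutive_errors` failures in a row, so a source
    that is permanently stuck does not loop forever.
    """
    iterator = iter(items)
    failures = 0
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return
        except Exception as exc:  # e.g. AttributeError from a None parse result
            failures += 1
            if on_error is not None:
                on_error(exc)
            if failures >= max_consecutive_errors:
                return
            continue
        failures = 0
        yield item


class FlakySource:
    """Hypothetical stand-in for a harvest iterator; None marks a bad response."""

    def __init__(self, values):
        self._values = list(values)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._values):
            raise StopIteration
        value = self._values[self._pos]
        self._pos += 1
        if value is None:
            raise AttributeError("'NoneType' object has no attribute 'find'")
        return value


errors = []
records = list(iter_tolerant(FlakySource(["a", None, "b"]), on_error=errors.append))
```

Passing a callback as `on_error` (here `errors.append`) keeps a record of what was skipped, so the gaps can be reported or re-harvested later.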
I am unsure what the problem is, but I keep getting the following error when trying to harvest a collection from Qatar Digital Library. I have to harvest through a whitelisted server, so unfortunately no one else will be able to test, but I'm hoping someone has a better instinct about why I'm getting this error and, more importantly, how to avoid it. The last time I harvested these records there were more than 32k, but I keep getting this error on record number 18,108. I would like to skip this record (and any other record with a similar problem) and harvest the rest of them, but the script always stops on this record. Here is the complete error message:
Here is my script: