Closed HobnobMancer closed 3 months ago
The issue persists: incomplete XML files downloaded from NCBI during the retrieval of NCBI taxonomies still fail to parse (see #124 and #125).
An IncompleteRead and a CorruptedXMLError need to be added to the try/excepts on lines 201 and 204 in cazy_webscraper/ncbi/taxonomy/multiple_taxa.py
Has this bug been fixed?
I haven't been able to replicate this error myself. As far as I can tell it should be fixed in the latest version (2.3.0.2, and on branch issue_120_ncbi). I haven't had another chance to look at this until today.
Version 2.3.0.3 will be released later today, with a try/except for incomplete and corrupt reads around every call to NCBI to help with the problem. All cazy_webscraper can do is retry the connection to NCBI. If incomplete or corrupted reads persist, the most likely cause is the connection closing prematurely, which is independent of cazy_webscraper.
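For anyone following along, the retry approach described above can be sketched roughly as below. This is a minimal illustration, not the code shipped in cazy_webscraper: the helper name call_with_retries and its retries and delay parameters are hypothetical, and the stand-in CorruptedXMLError class is only there so the sketch runs when Biopython is not installed.

```python
import time
from http.client import IncompleteRead

try:
    # Raised by Biopython's Entrez parser when the XML is truncated or malformed
    from Bio.Entrez.Parser import CorruptedXMLError
except ImportError:
    class CorruptedXMLError(ValueError):
        """Stand-in used only when Biopython is not installed."""


def call_with_retries(ncbi_call, retries=3, delay=1.0):
    """Run ncbi_call(), retrying on incomplete or corrupted reads.

    Hypothetical helper: returns the call's result, or None if every
    attempt failed with IncompleteRead or CorruptedXMLError.
    """
    for attempt in range(1, retries + 1):
        try:
            return ncbi_call()
        except (IncompleteRead, CorruptedXMLError) as err:
            print(f"Attempt {attempt}/{retries} failed: {err!r}")
            if attempt < retries:
                # Pause before reconnecting to NCBI
                time.sleep(delay)
    return None
```

With Biopython, a call could be wrapped along the lines of call_with_retries(lambda: Entrez.read(Entrez.efetch(db="taxonomy", id=tax_ids, retmode="xml"))), where tax_ids is a placeholder for the IDs being queried. Returning None after the final attempt lets the caller log the failed batch and move on instead of crashing.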
This issue should now be addressed in version 2.3.0.3 - see release notes.
I'll leave this issue open for a while in case the issue persists.
As the issue seems to have been resolved, I will close it. If the problem persists, please feel free to reopen this issue.
Please complete this report in full and as much detail as possible. It will help with getting the bug fixed far sooner!
Describe the bug
A clear and concise description of what the bug is. Please include what you are trying to get the tool to do?
cazy_webscraper crashes when receiving an incomplete read from NCBI, while downloading the latest taxonomy data for records with multiple taxa in CAZy.
To Reproduce
Please include the specific steps (including all code) you performed, so that we can check if the behaviour can be reproduced:
cazy_webscraper email -d database.db
Error:
Expected behavior
Catch incomplete read error and parse.