Open james-skelton opened 2 years ago
The implications of this are that:
- Previous UniProt releases don't seem to provide separate human-only files, so I've been trying to generate these myself.
- Unfortunately, parsing the TrEMBL file from the 2022_02 release fails with an unexpected EOF after ~133,000,000 records. I'm not sure why this happens, and it's quite slow to debug. I'm considering downloading the UniProt XML file to use for the UniProt parser; however, this will involve rewriting the logic, as the properties parsed and their names differ.
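For reference, here is a rough sketch of how the human-only file could be generated without a full parser. It is not what the project currently does. It assumes the flat-file conventions that each entry ends with a `//` line and that the organism taxonomy is given on an `OX` line as `NCBI_TaxID=9606` for human. The names `filter_by_taxid`, `FilterResult` and `main` are made up for illustration. It also reports whether the input ended partway through an entry, which might help narrow down the unexpected EOF.

```python
import gzip
import re
from typing import List, NamedTuple, TextIO


class FilterResult(NamedTuple):
    complete: int    # entries that ended with a '//' terminator
    kept: int        # complete entries whose OX line matched the taxon
    truncated: bool  # True if the input ended in the middle of an entry


def filter_by_taxid(src: TextIO, dst: TextIO, taxid: str = "9606") -> FilterResult:
    """Stream a UniProt flat file and copy only entries for one taxon.

    Assumption: entries end with '//' and carry 'OX   NCBI_TaxID=<id>'.
    Only one entry is held in memory at a time.
    """
    # \b stops 9606 from matching a longer ID such as 96061.
    ox_pattern = re.compile(rf"NCBI_TaxID={re.escape(taxid)}\b")
    entry: List[str] = []
    matches = False
    complete = kept = 0

    for line in src:
        entry.append(line)
        if line.startswith("OX") and ox_pattern.search(line):
            matches = True
        elif line.startswith("//"):
            complete += 1
            if matches:
                dst.writelines(entry)
                kept += 1
            entry, matches = [], False

    # Leftover non-blank lines mean the last entry never reached '//'.
    truncated = any(leftover.strip() for leftover in entry)
    return FilterResult(complete, kept, truncated)


def main(src_path: str, dst_path: str) -> FilterResult:
    """Filter a gzipped flat file; both paths are supplied by the caller."""
    with gzip.open(src_path, "rt") as src, open(dst_path, "w") as dst:
        return filter_by_taxid(src, dst)
```

If the download itself is incomplete, `gzip` should raise `EOFError` while reading, which would point to a truncated archive rather than a parsing problem. If instead the function returns `truncated=True`, the decompressed stream ended without a final `//`.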
Biopython is unable to parse the latest version of UniProt. This is a known issue and is due to a change in the feature (FT) lines of the UniProt file.
See issue #4021 from the biopython GitHub repository: https://github.com/biopython/biopython/issues/4021