Closed jakebeal closed 2 years ago
Looks like the issue is triggered by annotations on a ComponentDefinition from the NCBI namespace (http://www.ncbi.nlm.nih.gov/genbank).
A workaround for this bug is to simply strip all annotations from ComponentDefinitions, except those in a few known namespaces.
This is the kludge that I am using in Python:
keepers = {'http://sbols.org/v2', 'http://www.w3.org/ns/prov', 'http://purl.org/dc/terms/',
           'http://sboltools.org/backport'}
for c in doc2.componentDefinitions:
    # wipe out all annotation properties outside the namespaces in keepers
    c.properties = {p: v for p, v in c.properties.items()
                    if any(p.startswith(k) for k in keepers)}
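For anyone who wants to try the filtering logic without loading a document, here is a minimal standalone sketch of the same idea on a plain dict. It assumes, as the snippet above does, that properties are stored as a dict keyed by property URI. The function name `keep_known_namespaces` and the example property names (including the `#reference` one in the GenBank namespace) are made up for illustration.

```python
# Namespace prefixes to keep (same list as in the workaround above).
KEEPERS = (
    'http://sbols.org/v2',
    'http://www.w3.org/ns/prov',
    'http://purl.org/dc/terms/',
    'http://sboltools.org/backport',
)


def keep_known_namespaces(properties, keepers=KEEPERS):
    """Return a copy of `properties` without keys outside the kept namespaces."""
    # str.startswith accepts a tuple of prefixes, so no explicit loop is needed.
    prefixes = tuple(keepers)
    return {p: v for p, v in properties.items() if p.startswith(prefixes)}


# Made-up property dict for illustration only; the last key is a hypothetical
# annotation in the NCBI GenBank namespace.
example = {
    'http://sbols.org/v2#displayId': ['example_part'],
    'http://purl.org/dc/terms/title': ['Example part'],
    'http://www.ncbi.nlm.nih.gov/genbank#reference': ['http://example.org/ref'],
}

filtered = keep_known_namespaces(example)
print(sorted(filtered))
```

The function returns a new dict rather than mutating its input, so the original properties stay available if the stripped annotations are needed later.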
The underlying issue appears to be that these SBOL records were converted from GenBank in the first place. The references in the GenBank file are not of the standard form, but rather just point to a URL, for example:
The converter was not expecting to see this, since references in GenBank normally look like this:
REFERENCE   1  (bases 1 to 5028)
  AUTHORS   Torpey,L.E., Gibbs,P.E., Nelson,J. and Lawrence,C.W.
  TITLE     Cloning and sequence of REV7, a gene whose function is required for
            DNA damage-induced mutagenesis in Saccharomyces cerevisiae
  JOURNAL   Yeast 10 (11), 1503-1509 (1994)
  PUBMED    7871890
https://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html
Unfortunately, this is another example of the non-standard nature of GenBank.
In any case, a simple fix is to skip over these URL-only references in the conversion back to GenBank.
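To illustrate the idea of skipping such references, here is a rough Python sketch. It is not the converter's actual code: the helper names and the rule used to recognize a URL-only reference (the whole text is a single http(s) URL) are assumptions made for this example, and the sample strings are made up.

```python
import re

# Assumed rule: a reference is "URL-only" if its entire text is one http(s) URL.
_URL_ONLY = re.compile(r"^\s*https?://\S+\s*$")


def is_url_only_reference(text):
    """Return True if the reference text is nothing but a single URL."""
    return bool(_URL_ONLY.match(text))


def exportable_references(references):
    """Keep only the references that are not bare URLs."""
    return [r for r in references if not is_url_only_reference(r)]


# Made-up inputs: one bare URL and one conventional citation string.
refs = [
    "http://example.org/some-record",
    "Torpey,L.E., Gibbs,P.E., Nelson,J. and Lawrence,C.W. Yeast 10 (11), 1503-1509 (1994)",
]
kept = exportable_references(refs)
print(kept)
```

A stricter converter might instead try to map the URL into some other GenBank field, but skipping keeps the export from failing on input it does not understand.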
I believe this is now fixed.
Attempting to convert this file (error.xml.txt) to GenBank causes libSBOLj's conversion routine to crash opaquely, even though it considers the file valid.
Error report: