SynBioDex / pySBOL2

A pure Python implementation of the SBOL standard.
Apache License 2.0

Copy function doesn't remap namespaces as expected, breaks links #413

Open jakebeal opened 2 years ago

jakebeal commented 2 years ago

Consider the attached SBOL2 file: lycopene.xml.txt

The following sequence of calls shows namespace remapping behavior that is either incorrect or, because copy is undocumented, being used incorrectly:

import sbol2

doc1 = sbol2.Document()
doc1.read('lycopene.xml')

print(doc1.componentDefinitions[0].identity)
# http://liverpool.ac.uk/ComponentDefinition/P21684_10000_gene/1

doc2 = sbol2.Document()
doc1.copy(target_doc=doc2)  # no namespace argument, no problems
print(doc2.componentDefinitions[0].identity)
# http://liverpool.ac.uk/ComponentDefinition/P21684_10000_gene/1

doc3 = sbol2.Document()
doc1.copy('http://liverpool.ac.uk', doc3)  # should be a no-op for materials already in this namespace
print(doc3.componentDefinitions[0].identity)
# http://examples.org/ComponentDefinition/P21684_10000_gene/1

doc4 = sbol2.Document()
doc1.copy('http://somethingelse.org', doc4) 
print(doc4.componentDefinitions[0].identity)
# http://liverpool.ac.uk/ComponentDefinition/P21684_10000_gene/1

Links to sequences also break whenever a namespace is passed to copy. Other links are presumably broken as well, but these are the ones I noticed:

print(doc1.componentDefinitions[0].sequence.identity)
# http://examples.org/Sequence/P21684_10000_gene_seq/1
print(doc2.componentDefinitions[0].sequence)
# http://examples.org/Sequence/P21684_10000_gene_seq/1
print(doc3.componentDefinitions[0].sequence)
# None
print(doc4.componentDefinitions[0].sequence)
# None
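
For reference, the behavior I would expect can be sketched in plain Python, without pySBOL2. This is only an illustration of the expectation, not the library's implementation: remap_uri, remap_objects, and dangling_references are hypothetical helper names, and passing the source namespace explicitly is an assumption made here for clarity. The point is that identities and references should be rewritten by the same rule, so that a link never ends up pointing at an object that is no longer in the document.

```python
def remap_uri(uri, source_ns, target_ns):
    """Swap the source_ns prefix of uri for target_ns; leave other URIs alone.

    Hypothetical helper, not part of pySBOL2.
    """
    source_ns = source_ns.rstrip('/')
    target_ns = target_ns.rstrip('/')
    if uri == source_ns or uri.startswith(source_ns + '/'):
        return target_ns + uri[len(source_ns):]
    return uri


def remap_objects(objects, source_ns, target_ns):
    """Apply the same remapping to identities and to the references they hold.

    objects maps each identity to the list of identities it refers to.
    """
    return {
        remap_uri(identity, source_ns, target_ns):
            [remap_uri(ref, source_ns, target_ns) for ref in refs]
        for identity, refs in objects.items()
    }


def dangling_references(objects):
    """Return references that do not resolve to any identity in objects."""
    return sorted(
        ref for refs in objects.values() for ref in refs if ref not in objects
    )


# Identities taken from the output above; the dict layout is an assumption.
cd = 'http://liverpool.ac.uk/ComponentDefinition/P21684_10000_gene/1'
seq = 'http://examples.org/Sequence/P21684_10000_gene_seq/1'
objects = {cd: [seq], seq: []}

# Copying into the namespace the objects already use should change nothing.
same = remap_objects(objects, 'http://liverpool.ac.uk', 'http://liverpool.ac.uk')

# Copying into a new namespace should move the identity and keep the link valid.
moved = remap_objects(objects, 'http://liverpool.ac.uk', 'http://somethingelse.org')
```

Under these assumptions, same is identical to objects, and dangling_references(moved) is empty because the sequence link still resolves after the ComponentDefinition is moved.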