Closed eyesmo closed 2 years ago
Hi @eyesmo, thanks for trying out pySBOL!
One thing you can try before calling doc.write is to call print(doc.validate()) to see if you can identify problems with the Document.
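A minimal sketch of that validate-before-write pattern follows. The helper name write_if_valid is hypothetical and not part of pySBOL. It assumes only that doc has a validate() method returning a report string and a write(path) method, and that a failing report starts with "Invalid", as in the validator message quoted further down this thread.

```python
def write_if_valid(doc, path):
    """Validate doc first and write it only if the report is not 'Invalid'.

    Hypothetical helper: assumes doc.validate() returns a report string
    and doc.write(path) serializes the document to path.
    """
    report = doc.validate()
    print(report)  # show the full validator message either way
    if report.strip().startswith('Invalid'):
        return False  # skip the write so the problem can be fixed first
    doc.write(path)
    return True
```

Calling write_if_valid(doc, folderPath + 'AF_SpisPink_Cassette.xml') would then print the report and only attempt the write when the report does not start with "Invalid".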
An easy fix I would suggest trying is to simply replace pySBOL with pySBOL2 (https://github.com/SynBioDex/pySBOL2). pySBOL2 is implemented natively in Python rather than C++, so it is more stable than pySBOL. The APIs are exactly the same, so your code should work as is.
Thanks for the suggestions! I've switched to using pySBOL2.
print(doc.validate())
throws the following strong validation error:
Invalid. sbol-10902: Strong Validation Error: The locations property of a SequenceAnnotation is REQUIRED and MUST contain a non-empty set of Location objects. Reference: SBOL Version 2.3.0 Section 7.7.4 on page 32: http://sys-bio.org/ComponentDefinition/AF_SpisPink_Cassette/sPinkF/1 Validation failed.
So I think the problem is arising when I attempt to add additional SequenceAnnotations with specific Locations to the ComponentDefinition for the compiled design. The Location (and Range?) objects I try to create and assign to the SequenceAnnotation aren't being added correctly.
Here's an example of how I'm trying to add an annotation for a primer sequence. Does anything immediately pop out as incorrectly done?
#Now add the other primers.
#Define the SequenceAnnotation object
sPinkF_Primer = SequenceAnnotation('sPinkF')
#Define which ComponentDefinition the SequenceAnnotation refers to
sPinkF_Primer.component = 'AF_SpisPink_Cassette'
#Define a Range object for where the SequenceAnnotation will go on its component
sPinkF_PrimerRange = Range('sPinkF')
sPinkF_PrimerRange.start = 116
sPinkF_PrimerRange.end = 141
#Define a Location object which will hold the Range start/end information
sPinkF_PrimerLoc = Location('sPinkF')
sPinkF_PrimerLoc.Range = sPinkF_PrimerRange
#Define an orientation for the Location (top strand or bottom strand)
sPinkF_PrimerLoc.orientation = SBOL_ORIENTATION_INLINE
#Now define the location of the testPrimer sequence annotation, using testPrimerLoc
sPinkF_Primer.locations.add(sPinkF_PrimerLoc)
print(sPinkF_PrimerLoc)
print(sPinkF_Primer.locations)
#And for reference purposes, define the sequence in the testPrimer SequenceAnnotation
sPinkF_Primer.sequence = 'AAGCTCTTCATCCAATGTCGCACTCAAAACAAGCACTGG' #Note that this primer has some extra non-complementary sequence on its 5' end
#And link to the Benchling file from which this design was originally derived
sPinkF_Primer.wasDerivedFrom = 'https://benchling.com/openbioeconomy/f/lib_RSHKnK2W-destination-vector/seq_MMuUpcqh-af_spispink_cassette/edit'
#I think this works as a way to add annotations?
AF_SpisPink_Cassette.sequenceAnnotations.add(sPinkF_Primer)
These lines are potentially problematic:
#Define a Range object for where the SequenceAnnotation will go on its component
sPinkF_PrimerRange = Range('sPinkF')
sPinkF_PrimerRange.start = 116
sPinkF_PrimerRange.end = 141
#Define a Location object which will hold the Range start/end information
sPinkF_PrimerLoc = Location('sPinkF')
sPinkF_PrimerLoc.Range = sPinkF_PrimerRange
#Define an orientation for the Location (top strand or bottom strand)
sPinkF_PrimerLoc.orientation = SBOL_ORIENTATION_INLINE
#Now define the location of the testPrimer sequence annotation, using testPrimerLoc
sPinkF_Primer.locations.add(sPinkF_PrimerLoc)
They can be replaced with the following. Note that Range is a subclass of Location, so you only need to create one or the other type of object, not both:
#Define a Range object for where the SequenceAnnotation will go on its component
sPinkF_PrimerRange = Range('sPinkF')
sPinkF_PrimerRange.start = 116
sPinkF_PrimerRange.end = 141
sPinkF_PrimerRange.orientation = SBOL_ORIENTATION_INLINE
#Now add the Range to the locations of the sPinkF sequence annotation
sPinkF_Primer.locations.add(sPinkF_PrimerRange)
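If many annotations need to be added, the same steps can be wrapped in a small function. This is only a sketch: attach_range is a hypothetical name, and it relies on nothing beyond the attributes used in this thread (start, end, orientation, and locations.add). The coordinate check assumes the SBOL convention that Range positions are 1-based and inclusive.

```python
def attach_range(annotation, location_range, start, end, orientation):
    """Configure a Range-like object and add it to an annotation's locations.

    Hypothetical helper: annotation must expose locations.add(), and
    location_range must accept start, end and orientation attributes.
    """
    # SBOL Range coordinates are 1-based and inclusive, so start >= 1
    # and end >= start are the minimum sanity checks.
    if start < 1 or end < start:
        raise ValueError(f'bad range {start}..{end}')
    location_range.start = start
    location_range.end = end
    location_range.orientation = orientation
    annotation.locations.add(location_range)
    return location_range
```

With the objects from this thread, the call would look like attach_range(sPinkF_Primer, Range('sPinkF'), 116, 141, SBOL_ORIENTATION_INLINE).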
Thanks so much! That and a couple other tweaks got the whole design to be valid. One last question: how would you recommend visualizing a finished design, to check that the sequence is correct and features are correctly positioned? I tried exporting to GenBank but the conversion was quite lossy when I opened it in SnapGene Viewer: the full sequence showed up, but all the annotations were labeled only with their roles (promoter, CDS, etc.), and the name for each part was missing.
I also found SnapGene Viewer can't open the .xml file that contains the complete SBOL design description.
Are there sequence viewing applications you'd recommend that can open SBOL .xml design files (does SBOLDesigner 2 also have SnapGene/Benchling-like sequence browsing and editing capabilities)? Alternatively, is there an export format or command you'd recommend that generally recovers both the component type and the component name (and maybe even the component description) in the exported file format?
Generally speaking, there isn't great interoperability between sequence editors and SBOL tools; there's definitely a need. As for visualization tools, there aren't any great ones in Python. There are the https://sbolcanvas.org/canvas/ and http://visbol.org/ web tools for visualization. Also, I believe there is a SnapGene plug-in for SynBioHub. For more about this I'll refer you to @cjmyers. Chris, is there anything more you want to say about the SnapGene plug-in and how well conversion works between SnapGene and SBOL?
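Without a viewer, a rough positional check can still be done in plain Python by slicing the sequence string at each feature's coordinates. The sketch below is not a pySBOL function: feature_report is a hypothetical name, and it assumes the feature names and 1-based inclusive start/end values have already been collected into tuples. It reads the top strand only and does not reverse-complement anything.

```python
def feature_report(sequence, features):
    """Return one tab-separated line per feature: name, range, covered bases.

    sequence is a plain string; features is an iterable of
    (name, start, end) tuples with 1-based inclusive coordinates.
    """
    lines = []
    for name, start, end in features:
        # Convert 1-based inclusive coordinates to a Python slice.
        covered = sequence[start - 1:end]
        lines.append(f'{name}\t{start}..{end}\t{covered}')
    return lines


# Toy sequence and feature, used only to illustrate the call.
for line in feature_report('ACGTACGT', [('demo', 2, 4)]):
    print(line)
```

Comparing the printed bases against the expected part sequence shows quickly whether an annotation is shifted by one or placed on the wrong region.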
As there has been no activity on this issue for nearly a year, I'm going to close it. Please open a new issue if additional help is needed.
Hi! I'm working on learning to use pySBOL for my wetware design work. Towards that end, I've been trying to take some genetic designs stored in Benchling, and re-build them as SBOL documents with pySBOL, in a Colab notebook.
I've gotten to the 'last' step, of writing the design to a .xml or exporting the design to a .gb file. However, when I try to run either
result = doc.write(folderPath + 'AF_SpisPink_Cassette.xml')
or
doc.exportToFormat('GenBank', folderPath + 'AF_SpisPink_Cassette.gb')
the Colab notebook pops up a little window that says 'your session crashed for an unknown reason,' like so:
When I then click on "View runtime logs," here's what I see:
Below is the code I was working on to create the SBOL design. Apologies if it's ugly/wrong--I'm just in the early stages of learning this package and the SBOL data model.
As a side note, are there any repositories of example SBOL designs created and visualized with pySBOL and related packages, similar to the sample plots for Matplotlib?
Thanks!
Code: