Open harirngd opened 1 year ago
The error is saying that the end of the sequence (index 8851) is incompatible with the assembly overhang selector dw.TmSegmentSelector(min_tm=55, max_tm=65), probably because the sequence's end is GC-rich and so has a high Tm. Note that DnaWeaver assumes that the end of the sequence will anneal with e.g. a receptor plasmid, and so it is subject to the min_tm/max_tm constraint.
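To get a feel for why a GC-rich end can fall outside the Tm window, here is a minimal sketch in plain Python. It uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C) as a stand-in; that is an assumption for illustration only, and the Tm model DnaWeaver applies internally may differ. The function names and the example ends are hypothetical and not taken from the plasmid in this thread.

```python
def wallace_tm(seq):
    """Rough melting temperature (Wallace rule): 2 per A/T, 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def valid_end_overhangs(sequence, min_tm=55, max_tm=65, min_size=18, max_size=22):
    """Return (size, tm) for every suffix of `sequence` whose length is in
    [min_size, max_size] and whose rough Tm is in [min_tm, max_tm]."""
    results = []
    for size in range(min_size, max_size + 1):
        if size > len(sequence):
            break
        tm = wallace_tm(sequence[-size:])
        if min_tm <= tm <= max_tm:
            results.append((size, tm))
    return results


# Hypothetical sequence ends, for illustration only.
gc_rich_end = "ATATATAT" + "GCGCGGCCGCGGCCGCGGGCCC"   # last 22 bases are all G/C
balanced_end = "ATATATAT" + "ATGCATGCATGCATGCATGCAT"  # roughly 50% GC

print("GC-rich end :", valid_end_overhangs(gc_rich_end))
print("Balanced end:", valid_end_overhangs(balanced_end))
```

Under this rough rule, an 18-base suffix made only of G/C already scores 72, above max_tm=65, so no suffix of the GC-rich end qualifies, whereas the balanced end offers several valid sizes.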
Some ideas to get around it:
max_tm
Thanks @Zulko. I was about to post that deleting the end made it succeed. Putting the end at the beginning (since the sequence is circular) also made it succeed. I am going to try specifying that it is a circular plasmid, and also your approach of adding a 20 bp adaptor. How can I have it optimize the assembly as a circular plasmid? Is there a way for it to optimize where to split the circular sequence?
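Moving the end of a circular sequence to the beginning only changes where the linear representation starts; the molecule itself is unchanged. A minimal sketch in plain Python (the helper names and the toy sequence are hypothetical, and no DnaWeaver calls are involved):

```python
def rotate(sequence, offset):
    """Return the circular sequence re-linearised to start at `offset`.
    A negative offset moves bases from the end to the start."""
    offset %= len(sequence)
    return sequence[offset:] + sequence[:offset]


def same_circular_sequence(seq_a, seq_b):
    """True if seq_b is a rotation of seq_a (same strand only)."""
    return len(seq_a) == len(seq_b) and seq_b in (seq_a + seq_a)


plasmid = "ATGCCGTTAGGCTTAACCGG"          # toy sequence for illustration
moved_end_to_start = rotate(plasmid, -5)  # the last 5 bases now come first

print(moved_end_to_start)
print(same_circular_sequence(plasmid, moved_end_to_start))
```

Because every rotation describes the same plasmid, trying several start positions is a cheap way to move a problematic region away from the ends of the linear representation.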
I had a similar problem recently, where I needed to assemble a circular plasmid. But I used Golden Gate so the problem was finding overhangs with GoldenHinges (or design overhangs webapp). The solution was to set cutpoints upstream and downstream so the two ends make up about the desired fragment length, and ignore DNA ends. (Any ~100 bp segment should contain a suitable overhang.) Then I joined the two end fragments in a sequence editor. I think one key advantage of Golden Gate is that the efficiency is not sequence-dependent.
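The remark that a ~100 bp window should contain a suitable overhang can be illustrated with a rough sketch. It lists the 4-nt candidates in a window and discards palindromic ones, as well as ones that clash (directly or as reverse complements) with overhangs already chosen. This is a simplified stand-in and not a description of how GoldenHinges works internally; all names and the toy sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_overhangs(sequence, start, end, taken=()):
    """Return (position, overhang) pairs for the 4-nt windows that lie fully
    inside sequence[start:end], skipping palindromes and any overhang that
    equals, or is the reverse complement of, one in `taken`."""
    forbidden = set(taken) | {reverse_complement(o) for o in taken}
    candidates = []
    for i in range(start, min(end, len(sequence)) - 3):
        overhang = sequence[i:i + 4].upper()
        if overhang == reverse_complement(overhang):
            continue  # palindromic overhangs can pair with themselves
        if overhang in forbidden:
            continue  # would be ambiguous with an overhang already in use
        candidates.append((i, overhang))
    return candidates


toy_window = "AATTGGCCATGCA"  # toy sequence for illustration
print(candidate_overhangs(toy_window, 0, len(toy_window)))
print(candidate_overhangs(toy_window, 0, len(toy_window), taken=("TGGC",)))
```

In a real design the window would be the ~100 bp region around each intended cut point, and `taken` would accumulate the overhangs picked at the other junctions.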
In your case, one option is to decide which section you want to use as 'backbone' and treat it separately. Alternatively, here is a discussion on DNA Weaver with circular sequences: https://github.com/Edinburgh-Genome-Foundry/DnaWeaver/issues/1
I'm not aware of an option that forces overlap region locations.
Also a comment, your Gibson segment lengths look a bit long:
min_segment_length=1000,
max_segment_length=4000
Maybe this is just an example of you trying out a few parameters. In any case, if we look at the user manuals on the Telesis Bio website (the company where Dr Gibson is the CTO), we see 40 bp overlap regions (page 10 PDF).
Thanks @Zulko and Peter.
@veghp I used the segment lengths from one of the examples, which has 1000 and 4000. I am guessing that this should be supported, and that it refers to the length of each Gibson piece and not the overlap arms.
I’ll try the following and report back:
1) Adding configured Tm heads and tails.
2) Treating the backbone separately
Thanks for the pointer to GoldenHinges and the design overhangs web app. I am guessing DNAweaver is the equivalent for finding the best Gibson cut points, and I hope I am using it right.
Yes you're right, I got it confused with TmSegmentSelector's parameters min_size and max_size; see here: https://github.com/Edinburgh-Genome-Foundry/DnaWeaver/blob/f72d79f13c3e17501616944ca636aa530a1a6ed9/dnaweaver/SegmentSelector/TmSegmentSelector.py#L28
Defaults are 18 and 22
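To make the distinction concrete: min_segment_length and max_segment_length bound the length of each Gibson fragment, while min_size and max_size bound the homology region shared by neighbouring fragments. The sketch below is plain Python with hypothetical helper names and a random toy sequence. It assumes a fixed overlap length for simplicity, whereas a Tm-based selector would let the overlap size vary per junction.

```python
import random


def circular_slice(sequence, start, end):
    """Slice a circular sequence from start to end, wrapping past the origin."""
    n = len(sequence)
    start %= n
    end %= n
    if start < end:
        return sequence[start:end]
    return sequence[start:] + sequence[:end]


def gibson_fragments(sequence, cuts, overlap):
    """Split a circular sequence at `cuts` and extend each fragment so that its
    last `overlap` bases are also the first `overlap` bases of the next one."""
    if len(cuts) < 2:
        raise ValueError("need at least two cut points")
    cuts = sorted(cuts)
    fragments = []
    for i, start in enumerate(cuts):
        next_cut = cuts[(i + 1) % len(cuts)]
        fragments.append(circular_slice(sequence, start, next_cut + overlap))
    return fragments


random.seed(0)
toy_plasmid = "".join(random.choice("ACGT") for _ in range(120))
fragments = gibson_fragments(toy_plasmid, cuts=[10, 50, 90], overlap=8)

for i, fragment in enumerate(fragments):
    following = fragments[(i + 1) % len(fragments)]
    shared = fragment[-8:] == following[:8]
    print(f"fragment {i}: length {len(fragment)}, overlaps next: {shared}")
```

Here each fragment is 48 bases long (a 40-base segment plus an 8-base overlap), and the last fragment wraps around the origin so that the assembly closes into a circle.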
Guessing DNAweaver is the equivalent for the best Gibson cut points
Note that DNA Weaver also supports Golden Gate assembly, but not as thoroughly as GoldenHinges, as it's just not its main focus.
Hi all, I have finally managed to troubleshoot the plasmid assembly and get DNAweaver to suggest appropriate cut points and fragments and create a full report.
For some reason this particular sequence needed me to adjust the values of "min_size" and "max_size" for the Gibson arms, after which it just worked, regardless of which start position I tried in the code below.
overhang_selector=dw.TmSegmentSelector(min_tm=55, max_tm=65, min_size=18, max_size=28)
Once I put in topology='circular' (thanks for the pointer to the circular sequence post), the arms had well-designed overlaps that allow for a full circular assembly. I didn't realize that DNAweaver would not automatically march through the sequence assuming it is circular, but once I got the code tweaked it worked beautifully.
Attaching my code below. Feedback on how I could do this more optimally is hugely appreciated. Thanks Hari
import dnaweaver as dw
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Commercial offers the assembly station can order fragments from.
twist_dnafragments_offer = dw.CommercialDnaOffer(
    name="TwistDNAFragments",
    sequence_constraints=[
        dw.SequenceLengthConstraint(min_length=300, max_length=1800)
    ],
    pricing=dw.PerBasepairPricing(0.07),
    lead_time=40,
)

twist_dnaclonalgenes_offer = dw.CommercialDnaOffer(
    name="TwistDNAClonalGenes",
    sequence_constraints=[
        dw.SequenceLengthConstraint(min_length=300, max_length=5000)
    ],
    pricing=dw.PerBasepairPricing(0.09),
    lead_time=60,
)


def chop_sequence(sequence):
    """Return 10 rotations of a circular sequence, each one starting a further
    len(sequence) // 10 bases along the original."""
    sequence_list = []
    pieces_length = len(sequence) // 10
    for i in range(1, 11):
        sequence_list.append(
            sequence[pieces_length * i:] + sequence[:pieces_length * i]
        )
    return sequence_list


assembly_station = dw.DnaAssemblyStation(
    name="Gibson Assembly Station",
    assembly_method=dw.GibsonAssemblyMethod(
        # min_size/max_size bound the overlap arms; the segment lengths
        # below bound each Gibson fragment.
        overhang_selector=dw.TmSegmentSelector(
            min_tm=55, max_tm=65, min_size=18, max_size=28
        ),
        min_segment_length=1000,
        max_segment_length=4000,
        duration=5,
    ),
    supplier=[twist_dnafragments_offer, twist_dnaclonalgenes_offer],
    coarse_grain=20,
    logger="bar",
    fine_grain=1,
)
# Full sequence of the circular plasmid to assemble.
sequence = """tctggtggttctcccaagaagaagaggaaagtctaaccggtcatcatcaccatcaccattgagtttaaacccgctgatcagcctcgactgtgccttctagttgccagccatctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtcctttcctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtggggtggggcaggacagcaagggggaggattgggaagacaatagcaggcatgctggggatgcggtgggctctatggcttctgaggcggaaagaaccagctggggctcgataccgtcgacctctagctagagcttggcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaagcctagggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccga
tcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtcgacggatcgggagatcgatctcccgatcccctagggtcgactctcagtacaatctgctctgatgccgcatagttaagccagtatctgctccctgcttgtgtgttggaggtcgctgagtagtgcgcgagcaaaatttaagctacaacaaggcaaggcttgaccgacaattgcatgaagaatctgcttagggttaggcgttttgcgctgcttcgcgatgtacgggccagatatacgcgttgacattgattattgactagttattaatagtaatcaattacggggtcattagttcatagcccatatatggagttccgcgttacataacttacggtaaatggcccgcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagtaacgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttggcagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgcctggcattatgcccagtacatgaccttatgggactttcctacttggcagtacatctacgtattagtcatcgctattaccatggtgatgcggttttggcagtacatcaatgggcgtggatagcggtttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttttggcaccaaaatcaacgggactttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtaggcgtgtacggtgggaggtctatataagcagagctggtttagtgaaccgtcagatccgctagagatccgcggccgctaatacgactcactatagggagagccgccaccatgtccgaagtcgagttttcccatgagtactggatgagacacgcattgactctcgcaaagagggcttgggatgaacgcgaggtgcccgtgggggcagtactcgtgcataacaatcgcgtaatcggcgaaggttggaataggccgatcggacgccacgaccccactgcacatgcggaaatcatggcccttcgacagggagggcttgtgatgcagaattatcgacttatcgatgcgacgctgtacgtcacgcttgaaccttgcgtaatgtgcgcgggagctatgattcactcccgcattggacgagttgtattcggtgcccgcgacgccaagacgggtgccgcaggttcactgatggacgtgctgcatcacccaggcatgaaccaccgggtagaaatcacagaaggcatattggcggacgaatgtgcggcgctgttgtccgacttttttcgcatgcggaggcaggagatcaaggcccagaaaaaagcacaatcctctactgactctggtggttcttctggtggttctagcggcagcgagactcccgggacctc
agagtccgccacacccgaaagttctggtggttcttctggtggttcttccgaagtcgagttttcccatgagtactggatgagacacgcattgactctcgcaaagagggctcgagatgaacgcgaggtgcccgtgggggcagtactcgtgctcaacaatcgcgtaatcggcgaaggttggaatagggcaatcggactccacgaccccactgcacatgcggaaatcatggcccttcgacagggagggcttgtgatgcagaattatcgacttatcgatgcgacgctgtacgtcacgtttgaaccttgcgtaatgtgcgcgggagctatgattcactcccgcattggacgagttgtattcggtgttcgcaacgccaagacgggtgccgcaggttcactgatggacgtgctgcattacccaggcatgaaccaccgggtagaaatcacagaaggcatattggcggacgaatgtgcggcgctgttgtgttacttttttcgcatgcccaggcaggtctttaacgcccagaaaaaagcacaatcctctactgactctggtggttcttctggtggttctagcggcagcgagactcccgggacctcagagtccgccacacccgaaagttctggtggttcttctggtggttctgataaaaagtattctattggtttagccatcggcactaattccgttggatgggctgtcataaccgatgaatacaaagtaccttcaaagaaatttaaggtgttggggaacacagaccgtcattcgattaaaaagaatcttatcggtgccctcctattcgatagtggcgaaacggcagaggcgactcgcctgaaacgaaccgctcggagaaggtatacacgtcgcaagaaccgaatatgttacttacaagaaatttttagcaatgagatggccaaagttgacgattctttctttcaccgtttggaagagtccttccttgtcgaagaggacaagaaacatgaacggcaccccatctttggaaacatagtagatgaggtggcatatcatgaaaagtacccaacgatttatcacctcagaaaaaagctagttgactcaactgataaagcggacctgaggttaatctacttggctcttgcccatatgataaagttccgtgggcactttctcattgagggtgatctaaatccggacaactcggatgtcgacaaactgttcatccagttagtacaaacctataatcagttgtttgaagagaaccctataaatgcaagtggcgtggatgcgaaggctattcttagcgcccgcctctctaaatcccgacggctagaaaacctgatcgcacaattacccggagagaagaaaaatgggttgttcggtaaccttatagcgctctcactaggcctgacaccaaattttaagtcgaacttcgacttagctgaagatgccaaattgcagcttagtaaggacacgtacgatgacgatctcgacaatctactggcacaaattggagatcagtatgcggacttatttttggctgccaaaaaccttagcgatgcaatcctcctatctgacatactgagagttaatactgagattaccaaggcgccgttatccgcttcaatgatcaaaaggtacgatgaacatcaccaagacttgacacttctcaaggccctagtccgtcagcaactgcctgagaaatataaggaaatattctttgatcagtcgaaaaacgggtacgcaggttatattgacggcggagcgagtcaagaggaattctacaagtttatcaaacccatattagagaagatggatgggacggaagagttgcttgtaaaactcaatcgcgaagatctactgcgaaagcagcggactttcgacaacggtagcattccacatcaaatccacttaggcgaattgcatgctatacttagaaggcaggaggatttttatccgttcctcaaagacaatcgtgaaaagattgagaaaatcctaacctttcgcataccttactatgtgggacccc
tggcccgagggaactctcggttcgcatggatgacaagaaagtccgaagaaacgattactccatggaattttgaggaagttgtcgataaaggtgcgtcagctcaatcgttcatcgagaggatgaccaactttgacaagaatttaccgaacgaaaaagtattgcctaagcacagtttactttacgagtatttcacagtgtacaatgaactcacgaaagttaagtatgtcactgagggcatgcgtaaacccgcctttctaagcggagaacagaagaaagcaatagtagatctgttattcaagaccaaccgcaaagtgacagttaagcaattgaaagaggactactttaagaaaattgaatgcttcgattctgtcgagatctccggggtagaagatcgatttaatgcgtcacttggtacgtatcatgacctcctaaagataattaaagataaggacttcctggataacgaagagaatgaagatatcttagaagatatagtgttgactcttaccctctttgaagatcgggaaatgattgaggaaagactaaaaacatacgctcacctgttcgacgataaggttatgaaacagttaaagaggcgtcgctatacgggctggggacgattgtcgcggaaacttatcaacgggataagagacaagcaaagtggtaaaactattctcgattttctaaagagcgacggcttcgccaataggaactttatgcagctgatccatgatgactctttaaccttcaaagaggatatacaaaaggcacaggtttccggacaaggggactcattgcacgaacatattgcgaatcttgctggttcgccagccatcaaaaagggcatactccagacagtcaaagtagtggatgagctagttaaggtcatgggacgtcacaaaccggaaaacattgtaatcgagatggcacgcgaaaatcaaacgactcagaaggggcaaaaaaacagtcgagagcggatgaagagaatagaagagggtattaaagaactgggcagccagatcttaaaggagcatcctgtggaaaatacccaattgcagaacgagaaactttacctctattacctacaaaatggaagggacatgtatgttgatcaggaactggacataaaccgtttatctgattacgacgtcgatcacattgtaccccaatcctttttgaaggacgattcaatcgacaataaagtgcttacacgctcggataagaaccgagggaaaagtgacaatgttccaagcgaggaagtcgtaaagaaaatgaagaactattggcggcagctcctaaatgcgaaactgataacgcaaagaaagttcgataacttaactaaagctgagaggggtggcttgtctgaacttgacaaggccggatttattaaacgtcagctcgtggaaacccgccaaatcacaaagcatgttgcacagatactagattcccgaatgaatacgaaatacgacgagaacgataagctgattcgggaagtcaaagtaatcactttaaagtcaaaattggtgtcggacttcagaaaggattttcaattctataaagttagggagataaataactaccaccatgcgcacgacgcttatcttaatgccgtcgtagggaccgcactcattaagaaatacccgaagctagaaagtgagtttgtgtatggtgattacaaagtttatgacgtccgtaagatgatcgcgaaaagcgaacaggagataggcaaggctacagccaaatacttcttttattctaacattatgaatttctttaagacggaaatcactctggcaaacggagagatacgcaaacgacctttaattgaaaccaatggggagacaggtgaaatcgtatgggataagggccgggacttcgcgacggtgagaaaagttttgtccatgccccaagtcaacatagtaaagaaaactgaggtgcagaccggagggttttcaaaggaatcgattcttccaaaaaggaatagtgataagctcatcgct
cgtaaaaaggactgggacccgaaaaagtacggtggcttcgatagccctacagttgcctattctgtcctagtagtggcaaaagttgagaagggaaaatccaagaaactgaagtcagtcaaagaattattggggataacgattatggagcgctcgtcttttgaaaagaaccccatcgacttccttgaggcgaaaggttacaaggaagtaaaaaaggatctcataattaaactaccaaagtatagtctgtttgagttagaaaatggccgaaaacggatgttggctagcgccggagagcttcaaaaggggaacgaactcgcactaccgtctaaatacgtgaatttcctgtatttagcgtcccattacgagaagttgaaaggttcacctgaagataacgaacagaagcaactttttgttgagcagcacaaacattatctcgacgaaatcatagagcaaatttcggaattcagtaagagagtcatcctagctgatgccaatctggacaaagtattaagcgcatacaacaagcacagggataaacccatacgtgagcaggcggaaaatattatccatttgtttactcttaccaacctcggcgctccagccgcattcaagtattttgacacaacgatagatcgcaaacgatacacttctaccaaggaggtgctagacgcgacactgattcaccaatccatcacgggattatatgaaactcggatagatttgtcacagcttgggggtgac"""
def main(sequence):
    # Remove any line breaks or spaces inside the triple-quoted string.
    sequence = "".join(sequence.split())
    # Try each rotational start position of the circular plasmid.
    for index, a_seq in enumerate(chop_sequence(sequence)):
        a_seq_record = SeqRecord(
            Seq(a_seq),
            id=f"rotational_index {index}",
            name=f"Rotational Start number {index}",
            description=f"Sequence record for rotational Fragment {index}",
        )
        desired_sequence = dw.SequenceString.from_record(
            a_seq_record, topology="circular"
        )
        quote = assembly_station.get_quote(desired_sequence, with_assembly_plan=True)
        print(quote.assembly_step_summary())
        quote.compute_fragments_final_locations()
        report = quote.to_assembly_plan_report()
        report.write_full_report(f"report_{index}.zip")
        print(f"Done! (see report_{index}.zip)")


if __name__ == "__main__":
    main(sequence)
Glad to hear that changing the size parameters helped solve the issue. I can imagine that a larger size range allows DNA Weaver to choose from more options, and also that a longer sequence will tend towards a Tm average, one that will be within the Tm range.
Not sure what considerations were behind the original size parameter values, but perhaps these can be further modified, based on https://github.com/Edinburgh-Genome-Foundry/DnaWeaver/issues/14#issuecomment-1697230729 .
I realize now that this project has very little documentation on how it works internally. It would be worth adding a link to this (very long!) presentation in the README: https://github.com/Edinburgh-Genome-Foundry/egf-shared-documents/blob/master/slideshows/dnaweaver_presentation_iwbda_2019/talk_long.pdf
That's a good idea and an excellent presentation - feel free to edit the readme. Alternatively I can add a few comments later this week, perhaps at the end of the 'How it works' section.
Hi all, I am a newbie to DNAWeaver and want to use it to optimize my multipart Gibson assembly for ~8 kb plasmids.
My code is given below and the error I get is:
From Gibson Assembly Station - refused: No solution found! One forced cut was also invalid at indices {8851} - No solution found! One forced cut was also invalid at indices {8851}
I have tried changing the Gibson assembly station parameters and don't get anything other than the message above. How do I troubleshoot this? Apologies if I am not understanding some basic fact about DNAweaver. I can run the examples perfectly but can't get this or other sequences to give me a result.
Thanks