Edinburgh-Genome-Foundry / DnaWeaver

A route planner for DNA assembly
https://dnaweaver.genomefoundry.org
MIT License
29 stars 9 forks source link

What to do when Gibson assembly Station is refused : Apologies to put a routine use question as an issue #14

Open harirngd opened 1 year ago

harirngd commented 1 year ago

Hi All I am a newbie to DNAWeaver and want to use it to optimize my multipart Gibson assembly for ~8Kb plasmids

My code is given below and the error I get is

From Gibson Assembly Station - refused: No solution found! One forced cut was also invalid at indices {8851} - No solution found! One forced cut was also invalid at indices {8851}

I have tried changing the Gibson assembly station parameters and don't get anything other than the message above. How do I troubleshoot this ? Apologies if I am not understanding some basic fact about DNAweaver. I can run the examples perfectly but cant get this or other sequences to give me a result.

Thanks

import dnaweaver as dw

twist_dnafragments_offer = dw.CommercialDnaOffer(
    name="TwistDNAFragments",
    sequence_constraints=[
        dw.SequenceLengthConstraint(min_length=300,max_length=1800)
    ],
    pricing=dw.PerBasepairPricing(0.07),
    lead_time=40
)

twist_dnafclonalgenes_offer = dw.CommercialDnaOffer(
    name="TwistDNAClonalGenes",
    sequence_constraints=[
        dw.SequenceLengthConstraint(min_length=300,max_length=5000)
    ],
    pricing=dw.PerBasepairPricing(0.09),
    lead_time=40
)

assembly_station = dw.DnaAssemblyStation(
    name="Gibson Assembly Station",
    assembly_method=dw.GibsonAssemblyMethod(
        overhang_selector=dw.TmSegmentSelector(min_tm=55, max_tm=65),
        min_segment_length=1000,
        max_segment_length=4000,
        duration=5
    ),
    supplier=[twist_dnafragments_offer,twist_dnafclonalgenes_offer],
    coarse_grain=20
)
# sequence = dw.random_dna_sequence(8851, seed=123)
sequence = """CAGTACTCGTGCATAACAATCGCGTAATCGGCGAAGGTTGGAATAGGCCGATCGGACGCCACGACCCCAC
TGCACATGCGGAAATCATGGCCCTTCGACAGGGAGGGCTTGTGATGCAGAATTATCGACTTATCGATGCG
ACGCTGTACGTCACGCTTGAACCTTGCGTAATGTGCGCGGGAGCTATGATTCACTCCCGCATTGGACGAG
TTGTATTCGGTGCCCGCGACGCCAAGACGGGTGCCGCAGGTTCACTGATGGACGTGCTGCATCACCCAGG
CATGAACCACCGGGTAGAAATCACAGAAGGCATATTGGCGGACGAATGTGCGGCGCTGTTGTCCGACTTT
TTTCGCATGCGGAGGCAGGAGATCAAGGCCCAGAAAAAAGCACAATCCTCTACTGACTCTGGTGGTTCTT
CTGGTGGTTCTAGCGGCAGCGAGACTCCCGGGACCTCAGAGTCCGCCACACCCGAAAGTTCTGGTGGTTC
TTCTGGTGGTTCTTCCGAAGTCGAGTTTTCCCATGAGTACTGGATGAGACACGCATTGACTCTCGCAAAG
AGGGCTCGAGATGAACGCGAGGTGCCCGTGGGGGCAGTACTCGTGCTCAACAATCGCGTAATCGGCGAAG
GTTGGAATAGGGCAATCGGACTCCACGACCCCACTGCACATGCGGAAATCATGGCCCTTCGACAGGGAGG
GCTTGTGATGCAGAATTATCGACTTATCGATGCGACGCTGTACGTCACGTTTGAACCTTGCGTAATGTGC
GCGGGAGCTATGATTCACTCCCGCATTGGACGAGTTGTATTCGGTGTTCGCAACGCCAAGACGGGTGCCG
CAGGTTCACTGATGGACGTGCTGCATTACCCAGGCATGAACCACCGGGTAGAAATCACAGAAGGCATATT
GGCGGACGAATGTGCGGCGCTGTTGTGTTACTTTTTTCGCATGCCCAGGCAGGTCTTTAACGCCCAGAAA
AAAGCACAATCCTCTACTGACTCTGGTGGTTCTTCTGGTGGTTCTAGCGGCAGCGAGACTCCCGGGACCT
CAGAGTCCGCCACACCCGAAAGTTCTGGTGGTTCTTCTGGTGGTTCTGATAAAAAGTATTCTATTGGTTT
AGCCATCGGCACTAATTCCGTTGGATGGGCTGTCATAACCGATGAATACAAAGTACCTTCAAAGAAATTT
AAGGTGTTGGGGAACACAGACCGTCATTCGATTAAAAAGAATCTTATCGGTGCCCTCCTATTCGATAGTG
GCGAAACGGCAGAGGCGACTCGCCTGAAACGAACCGCTCGGAGAAGGTATACACGTCGCAAGAACCGAAT
ATGTTACTTACAAGAAATTTTTAGCAATGAGATGGCCAAAGTTGACGATTCTTTCTTTCACCGTTTGGAA
GAGTCCTTCCTTGTCGAAGAGGACAAGAAACATGAACGGCACCCCATCTTTGGAAACATAGTAGATGAGG
TGGCATATCATGAAAAGTACCCAACGATTTATCACCTCAGAAAAAAGCTAGTTGACTCAACTGATAAAGC
GGACCTGAGGTTAATCTACTTGGCTCTTGCCCATATGATAAAGTTCCGTGGGCACTTTCTCATTGAGGGT
GATCTAAATCCGGACAACTCGGATGTCGACAAACTGTTCATCCAGTTAGTACAAACCTATAATCAGTTGT
TTGAAGAGAACCCTATAAATGCAAGTGGCGTGGATGCGAAGGCTATTCTTAGCGCCCGCCTCTCTAAATC
CCGACGGCTAGAAAACCTGATCGCACAATTACCCGGAGAGAAGAAAAATGGGTTGTTCGGTAACCTTATA
GCGCTCTCACTAGGCCTGACACCAAATTTTAAGTCGAACTTCGACTTAGCTGAAGATGCCAAATTGCAGC
TTAGTAAGGACACGTACGATGACGATCTCGACAATCTACTGGCACAAATTGGAGATCAGTATGCGGACTT
ATTTTTGGCTGCCAAAAACCTTAGCGATGCAATCCTCCTATCTGACATACTGAGAGTTAATACTGAGATT
ACCAAGGCGCCGTTATCCGCTTCAATGATCAAAAGGTACGATGAACATCACCAAGACTTGACACTTCTCA
AGGCCCTAGTCCGTCAGCAACTGCCTGAGAAATATAAGGAAATATTCTTTGATCAGTCGAAAAACGGGTA
CGCAGGTTATATTGACGGCGGAGCGAGTCAAGAGGAATTCTACAAGTTTATCAAACCCATATTAGAGAAG
ATGGATGGGACGGAAGAGTTGCTTGTAAAACTCAATCGCGAAGATCTACTGCGAAAGCAGCGGACTTTCG
ACAACGGTAGCATTCCACATCAAATCCACTTAGGCGAATTGCATGCTATACTTAGAAGGCAGGAGGATTT
TTATCCGTTCCTCAAAGACAATCGTGAAAAGATTGAGAAAATCCTAACCTTTCGCATACCTTACTATGTG
GGACCCCTGGCCCGAGGGAACTCTCGGTTCGCATGGATGACAAGAAAGTCCGAAGAAACGATTACTCCAT
GGAATTTTGAGGAAGTTGTCGATAAAGGTGCGTCAGCTCAATCGTTCATCGAGAGGATGACCAACTTTGA
CAAGAATTTACCGAACGAAAAAGTATTGCCTAAGCACAGTTTACTTTACGAGTATTTCACAGTGTACAAT
GAACTCACGAAAGTTAAGTATGTCACTGAGGGCATGCGTAAACCCGCCTTTCTAAGCGGAGAACAGAAGA
AAGCAATAGTAGATCTGTTATTCAAGACCAACCGCAAAGTGACAGTTAAGCAATTGAAAGAGGACTACTT
TAAGAAAATTGAATGCTTCGATTCTGTCGAGATCTCCGGGGTAGAAGATCGATTTAATGCGTCACTTGGT
ACGTATCATGACCTCCTAAAGATAATTAAAGATAAGGACTTCCTGGATAACGAAGAGAATGAAGATATCT
TAGAAGATATAGTGTTGACTCTTACCCTCTTTGAAGATCGGGAAATGATTGAGGAAAGACTAAAAACATA
CGCTCACCTGTTCGACGATAAGGTTATGAAACAGTTAAAGAGGCGTCGCTATACGGGCTGGGGACGATTG
TCGCGGAAACTTATCAACGGGATAAGAGACAAGCAAAGTGGTAAAACTATTCTCGATTTTCTAAAGAGCG
ACGGCTTCGCCAATAGGAACTTTATGCAGCTGATCCATGATGACTCTTTAACCTTCAAAGAGGATATACA
AAAGGCACAGGTTTCCGGACAAGGGGACTCATTGCACGAACATATTGCGAATCTTGCTGGTTCGCCAGCC
ATCAAAAAGGGCATACTCCAGACAGTCAAAGTAGTGGATGAGCTAGTTAAGGTCATGGGACGTCACAAAC
CGGAAAACATTGTAATCGAGATGGCACGCGAAAATCAAACGACTCAGAAGGGGCAAAAAAACAGTCGAGA
GCGGATGAAGAGAATAGAAGAGGGTATTAAAGAACTGGGCAGCCAGATCTTAAAGGAGCATCCTGTGGAA
AATACCCAATTGCAGAACGAGAAACTTTACCTCTATTACCTACAAAATGGAAGGGACATGTATGTTGATC
AGGAACTGGACATAAACCGTTTATCTGATTACGACGTCGATCACATTGTACCCCAATCCTTTTTGAAGGA
CGATTCAATCGACAATAAAGTGCTTACACGCTCGGATAAGAACCGAGGGAAAAGTGACAATGTTCCAAGC
GAGGAAGTCGTAAAGAAAATGAAGAACTATTGGCGGCAGCTCCTAAATGCGAAACTGATAACGCAAAGAA
AGTTCGATAACTTAACTAAAGCTGAGAGGGGTGGCTTGTCTGAACTTGACAAGGCCGGATTTATTAAACG
TCAGCTCGTGGAAACCCGCCAAATCACAAAGCATGTTGCACAGATACTAGATTCCCGAATGAATACGAAA
TACGACGAGAACGATAAGCTGATTCGGGAAGTCAAAGTAATCACTTTAAAGTCAAAATTGGTGTCGGACT
TCAGAAAGGATTTTCAATTCTATAAAGTTAGGGAGATAAATAACTACCACCATGCGCACGACGCTTATCT
TAATGCCGTCGTAGGGACCGCACTCATTAAGAAATACCCGAAGCTAGAAAGTGAGTTTGTGTATGGTGAT
TACAAAGTTTATGACGTCCGTAAGATGATCGCGAAAAGCGAACAGGAGATAGGCAAGGCTACAGCCAAAT
ACTTCTTTTATTCTAACATTATGAATTTCTTTAAGACGGAAATCACTCTGGCAAACGGAGAGATACGCAA
ACGACCTTTAATTGAAACCAATGGGGAGACAGGTGAAATCGTATGGGATAAGGGCCGGGACTTCGCGACG
GTGAGAAAAGTTTTGTCCATGCCCCAAGTCAACATAGTAAAGAAAACTGAGGTGCAGACCGGAGGGTTTT
CAAAGGAATCGATTCTTCCAAAAAGGAATAGTGATAAGCTCATCGCTCGTAAAAAGGACTGGGACCCGAA
AAAGTACGGTGGCTTCGATAGCCCTACAGTTGCCTATTCTGTCCTAGTAGTGGCAAAAGTTGAGAAGGGA
AAATCCAAGAAACTGAAGTCAGTCAAAGAATTATTGGGGATAACGATTATGGAGCGCTCGTCTTTTGAAA
AGAACCCCATCGACTTCCTTGAGGCGAAAGGTTACAAGGAAGTAAAAAAGGATCTCATAATTAAACTACC
AAAGTATAGTCTGTTTGAGTTAGAAAATGGCCGAAAACGGATGTTGGCTAGCGCCGGAGAGCTTCAAAAG
GGGAACGAACTCGCACTACCGTCTAAATACGTGAATTTCCTGTATTTAGCGTCCCATTACGAGAAGTTGA
AAGGTTCACCTGAAGATAACGAACAGAAGCAACTTTTTGTTGAGCAGCACAAACATTATCTCGACGAAAT
CATAGAGCAAATTTCGGAATTCAGTAAGAGAGTCATCCTAGCTGATGCCAATCTGGACAAAGTATTAAGC
GCATACAACAAGCACAGGGATAAACCCATACGTGAGCAGGCGGAAAATATTATCCATTTGTTTACTCTTA
CCAACCTCGGCGCTCCAGCCGCATTCAAGTATTTTGACACAACGATAGATCGCAAACGATACACTTCTAC
CAAGGAGGTGCTAGACGCGACACTGATTCACCAATCCATCACGGGATTATATGAAACTCGGATAGATTTG
TCACAGCTTGGGGGTGACTCTGGTGGTTCTCCCAAGAAGAAGAGGAAAGTCTAACCGGTCATCATCACCA
TCACCATTGAGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGC
CCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAA
TTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGA
GGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACC
AGCTGGGGCTCGATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGT
GAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTAGGGTGC
CTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCG
TGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTT
CCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGT
AATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCC
AGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAA
ATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAG
CTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGA
AGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGG
GCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAA
CCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTA
GGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCT
GCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGC
TGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCT
TTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGAT
TATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATA
TGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTT
CGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCC
CCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGC
CGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGG
GAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGG
TGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATC
CCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCA
GTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTT
CTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCC
GGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCT
TCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCA
ACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGC
AAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGC
ATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGG
TTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCGATCTCCCGATCCCCT
AGGGTCGACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTG
TTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCA
TGAAGAATCTGCTTAGGGTTAGGCGTTTTGCGCTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACA
TTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTC
CGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAA
TAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACG
GTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGAC
GGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCT
ACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTT
TGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAA
CGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGG
AGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATCCGCTAGAGATCCGCGGCCGCTAATACGAC
TCACTATAGGGAGAGCCGCCACCATGTCCGAAGTCGAGTTTTCCCATGAGTACTGGATGAGACACGCATT
GACTCTCGCAAAGAGGGCTTGGGATGAACGCGAGGTGCCCGTGGGGG"""
quote = assembly_station.get_quote(sequence, with_assembly_plan=True)
Zulko commented 1 year ago

The error is saying that the end of the sequence 8851 is incompatible with the assembly overhang selector dw.TmSegmentSelector(min_tm=55, max_tm=65), probably because the sequence's end is GC rich and so has high Tm. Note that DnaWeaver assumes that the end of the sequence will anneal with e.g. a receptor plasmid and so is subject to the min_tm/max_tm constraint.

Some ideas to go around it:

harirngd commented 1 year ago

Thanks @Zulko . I was about to post that deleting the end made it succeed. Also putting the end at the beginning ( since it was circular) made it succeed . I am going to try specifying its a circular plasmid and also your approach of adding a 20bp adaptor. Wondering how to have it optimize it as a circular plasmid for the assembly?--is there a way where it optimized where to split the circular assembly?

veghp commented 1 year ago

I had a similar problem recently, where I needed to assemble a circular plasmid. But I used Golden Gate so the problem was finding overhangs with GoldenHinges (or design overhangs webapp). The solution was to set cutpoints upstream and downstream so the two ends make up about the desired fragment length, and ignore DNA ends. (Any ~100 bp segment should contain a suitable overhang.) Then I joined the two end fragments in a sequence editor. I think one key advantage of Golden Gate is that the efficiency is not sequence-dependent.

In your case, one option is to decide which section you want to use as 'backbone' and treat it separately. Alternatively, here is a discussion on DNA Weaver with circular sequences: https://github.com/Edinburgh-Genome-Foundry/DnaWeaver/issues/1

I'm not aware of an option that forces overlap region locations.

Also a comment, your Gibson segment lengths look a bit long:

    min_segment_length=1000,
    max_segment_length=4000

Maybe this is just an example of you trying out a few parameters. In any case, if we look at the user manuals on the Telesis Bio website (the company where Dr Gibson is the CTO), we see 40 bp overlap regions (page 10 PDF).

harirngd commented 1 year ago

Thanks @Zulko and Peter .

@veghp I used the segment lengths from one of the example data , which has 1000 and 4000. Guessing that should be supported and that refers to the length of each Gibson piece and not the overlap arms

I’ll try the following and report back: 1) Adding configured Tm heads and tails.
2) Treating the backbone separately

Thanks for the pointer to GoldenHinges and design overhang web app.Guessing DNAweaver is the equivalent for the best Gibson cut points and hoping I am using it right

veghp commented 1 year ago

Yes you're right, I got it confused with TmSegmentSelector's parameter min_size and max_size; see here: https://github.com/Edinburgh-Genome-Foundry/DnaWeaver/blob/f72d79f13c3e17501616944ca636aa530a1a6ed9/dnaweaver/SegmentSelector/TmSegmentSelector.py#L28

Defaults are 18 and 22

Zulko commented 1 year ago

Guessing DNAweaver is the equivalent for the best Gibson cut points

Note that Weaver also supports golden gate assembly, but not as thoroughly as GoldenHinges as it's just not it's main focus.

harirngd commented 1 year ago

Hi all, I have finally managed to troubleshoot the plasmid assembly and get DNAweaver to suggest appropriate cut points and fragments and create a full report.

For some reason this particular sequence needed me to adjust the values for "min_size" and "max_size" for the Gibson arms after which it just worked-regardless of which start position I tried in the code below.

overhang_selector=dw.TmSegmentSelector(min_tm=55, max_tm=65, min_size=18, max_size=28)

Once I put in the "topology=circular" ( thanks for the pointers to the Circular sequence post) , the arms had well designed overlaps to allow for a full circular assembly. I didnt realize that DNAweaver would not automatically march through the sequence assuming it is circular..but once I got the code tweaked it worked beautifully.

Attaching my code below. Feedback on how I could do this more optimally is hugely appreciated. Thanks Hari

import dnaweaver as dw
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

twist_dnafragments_offer = dw.CommercialDnaOffer(
    name="TwistDNAFragments",
    sequence_constraints=[
        dw.SequenceLengthConstraint(min_length=300,max_length=1800)
    ],
    pricing=dw.PerBasepairPricing(0.07),
    lead_time=40
)

twist_dnaclonalgenes_offer = dw.CommercialDnaOffer(
    name="TwistDNAClonalGenes",
    sequence_constraints=[
        dw.SequenceLengthConstraint(min_length=300,max_length=5000)
    ],
    pricing=dw.PerBasepairPricing(0.09),
    lead_time=60
)
def chop_sequence(sequence):
    sequence_list = []
    pieces_length = len(sequence)//10
    for i in range(1,11):
        sequence_list.append(sequence[pieces_length*i:]+ sequence[:pieces_length*i])
    return sequence_list

assembly_station = dw.DnaAssemblyStation(
    name="Gibson Assembly Station",
    assembly_method=dw.GibsonAssemblyMethod(
        overhang_selector=dw.TmSegmentSelector(min_tm=55, max_tm=65, min_size=18, max_size=28),
        min_segment_length=1000,
        max_segment_length=4000,
        duration=5
    ),
    supplier=[twist_dnafragments_offer, twist_dnaclonalgenes_offer],
    coarse_grain=20,
    logger="bar",
    fine_grain=1,
)

if __name__=="__main__":

    sequence = """tctggtggttctcccaagaagaagaggaaagtctaaccggtcatcatcaccatcaccattgagtttaaacccgctgatcagcctcgactgtgccttctagttgccagccatctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccactcccactgtcctttcctaataaaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctggggggtggggtggggcaggacagcaagggggaggattgggaagacaatagcaggcatgctggggatgcggtgggctctatggcttctgaggcggaaagaaccagctggggctcgataccgtcgacctctagctagagcttggcgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacgagccggaagcataaagtgtaaagcctagggtgcctaatgagtgagctaactcacattaattgcgttgcgctcactgcccgctttccagtcgggaaacctgtcgtgccagctgcattaatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaatacggttatccacagaatcaggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccccctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcccttcgggaagcgtggcgctttctcatagctcacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacccggtaagacacgacttatcgccactggcagcagccactggtaacaggattagcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcctaactacggctacactagaagaacagtatttggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaagggattttggtcatgagattatcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttggtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatctgtctatttcgttcatccatagttgcctgactccccgtcgtgtagataactacgatacgggagggcttaccatctggccccagtgctgcaatgataccgcgagacccacgctcaccggctccagatttatcagcaataaaccagccagccggaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccagtctattaattgttgccgggaagctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgctcgtcgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatggcagcactgcataattctcttactgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcgaccgagttgctcttgcccggcgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcatactcttcctttttcaatattattgaagcatttatcagggttattgtctcatgagcggatacatatttgaatgtatttagaaaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacctgacgtcgacggatcgggagatcgatctcccgatcccctagggtcgactctcagtacaatctgctctgatgccgcatagttaagccagtatctgctccctgcttgtgtgttggaggtcgctgagtagtgcgcgagcaaaatttaagctacaacaaggcaaggcttgaccgacaattgcatgaagaatctgcttagggttaggcgttttgcgctgcttcgcgatgtacgggccagatatacgcgttgacattgattattgactagttattaatagtaatcaattacggggtcattagttcatagcccatatatggagttccgcgttacataacttacggtaaatggcccgcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagtaacgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttggcagtacatcaagtgtatcatatgccaagtacgccccctattgacgtcaatgacggtaaatggcccgcctggcattatgcccagtacatgaccttatgggactttcctacttggcagtacatctacgtattagtcatcgctattaccatggtgatgcggttttggcagtacatcaatgggcgtggatagcggtttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttttggcaccaaaatcaacgggactttccaaaatgtcgtaacaactccgccccattgacgcaaatgggcggtaggcgtgtacggtgggaggtctatataagcagagctggtttagtgaaccgtcagatccgctagagatccgcggccgctaatacgactcactatagggagagccgccaccatgtccgaagtcgagttttcccatgagtactggatgagacacgcattgactctcgcaaagagggcttgggatgaacgcgaggtgcccgtgggggcagtactcgtgcataacaatcgcgtaatcggcgaaggttggaataggccgatcggacgccacgaccccactgcacatgcggaaatcatggcccttcgacagggagggcttgtgatgcagaattatcgacttatcgatgcgacgctgtacgtcacgcttgaaccttgcgtaatgtgcgcgggagctatgattcactcccgcattggacgagttgtattcggtgcccgcgacgccaagacgggtgccgcaggttcactgatggacgtgctgcatcacccaggcatgaaccaccgggtagaaatcacagaaggcatattggcggacgaatgtgcggcgctgttgtccgacttttttcgcatgcggaggcaggagatcaaggcccagaaaaaagcacaatcctctactgactctggtggttcttctggtggttctagcggcagcgagactcccgggacctcagagtccgccacacccgaaagttctggtggttcttctggtggttcttccgaagtcgagttttcccatgagtactggatgagacacgcattgactctcgcaaagagggctcgagatgaacgcgaggtgcccgtgggggcagtactcgtgctcaacaatcgcgtaatcggcgaaggttggaatagggcaatcggactccacgaccccactgcacatgcggaaatcatggcccttcgacagggagggcttgtgatgcagaattatcgacttatcgatgcgacgctgtacgtcacgtttgaaccttgcgtaatgtgcgcgggagctatgattcactcccgcattggacgagttgtattcggtgttcgcaacgccaagacgggtgccgcaggttcactgatggacgtgctgcattacccaggcatgaaccaccgggtagaaatcacagaaggcatattggcggacgaatgtgcggcgctgttgtgttacttttttcgcatgcccaggcaggtctttaacgcccagaaaaaagcacaatcctctactgactctggtggttcttctggtggttctagcggcagcgagactcccgggacctcagagtccgccacacccgaaagttctggtggttcttctggtggttctgataaaaagtattctattggtttagccatcggcactaattccgttggatgggctgtcataaccgatgaatacaaagtaccttcaaagaaatttaaggtgttggggaacacagaccgtcattcgattaaaaagaatcttatcggtgccctcctattcgatagtggcgaaacggcagaggcgactcgcctgaaacgaaccgctcggagaaggtatacacgtcgcaagaaccgaatatgttacttacaagaaatttttagcaatgagatggccaaagttgacgattctttctttcaccgtttggaagagtccttccttgtcgaagaggacaagaaacatgaacggcaccccatctttggaaacatagtagatgaggtggcatatcatgaaaagtacccaacgatttatcacctcagaaaaaagctagttgactcaactgataaagcggacctgaggttaatctacttggctcttgcccatatgataaagttccgtgggcactttctcattgagggtgatctaaatccggacaactcggatgtcgacaaactgttcatccagttagtacaaacctataatcagttgtttgaagagaaccctataaatgcaagtggcgtggatgcgaaggctattcttagcgcccgcctctctaaatcccgacggctagaaaacctgatcgcacaattacccggagagaagaaaaatgggttgttcggtaaccttatagcgctctcactaggcctgacaccaaattttaagtcgaacttcgacttagctgaagatgccaaattgcagcttagtaaggacacgtacgatgacgatctcgacaatctactggcacaaattggagatcagtatgcggacttatttttggctgccaaaaaccttagcgatgcaatcctcctatctgacatactgagagttaatactgagattaccaaggcgccgttatccgcttcaatgatcaaaaggtacgatgaacatcaccaagacttgacacttctcaaggccctagtccgtcagcaactgcctgagaaatataaggaaatattctttgatcagtcgaaaaacgggtacgcaggttatattgacggcggagcgagtcaagaggaattctacaagtttatcaaacccatattagagaagatggatgggacggaagagttgcttgtaaaactcaatcgcgaagatctactgcgaaagcagcggactttcgacaacggtagcattccacatcaaatccacttaggcgaattgcatgctatacttagaaggcaggaggatttttatccgttcctcaaagacaatcgtgaaaagattgagaaaatcctaacctttcgcataccttactatgtgggacccctggcccgagggaactctcggttcgcatggatgacaagaaagtccgaagaaacgattactccatggaattttgaggaagttgtcgataaaggtgcgtcagctcaatcgttcatcgagaggatgaccaactttgacaagaatttaccgaacgaaaaagtattgcctaagcacagtttactttacgagtatttcacagtgtacaatgaactcacgaaagttaagtatgtcactgagggcatgcgtaaacccgcctttctaagcggagaacagaagaaagcaatagtagatctgttattcaagaccaaccgcaaagtgacagttaagcaattgaaagaggactactttaagaaaattgaatgcttcgattctgtcgagatctccggggtagaagatcgatttaatgcgtcacttggtacgtatcatgacctcctaaagataattaaagataaggacttcctggataacgaagagaatgaagatatcttagaagatatagtgttgactcttaccctctttgaagatcgggaaatgattgaggaaagactaaaaacatacgctcacctgttcgacgataaggttatgaaacagttaaagaggcgtcgctatacgggctggggacgattgtcgcggaaacttatcaacgggataagagacaagcaaagtggtaaaactattctcgattttctaaagagcgacggcttcgccaataggaactttatgcagctgatccatgatgactctttaaccttcaaagaggatatacaaaaggcacaggtttccggacaaggggactcattgcacgaacatattgcgaatcttgctggttcgccagccatcaaaaagggcatactccagacagtcaaagtagtggatgagctagttaaggtcatgggacgtcacaaaccggaaaacattgtaatcgagatggcacgcgaaaatcaaacgactcagaaggggcaaaaaaacagtcgagagcggatgaagagaatagaagagggtattaaagaactgggcagccagatcttaaaggagcatcctgtggaaaatacccaattgcagaacgagaaactttacctctattacctacaaaatggaagggacatgtatgttgatcaggaactggacataaaccgtttatctgattacgacgtcgatcacattgtaccccaatcctttttgaaggacgattcaatcgacaataaagtgcttacacgctcggataagaaccgagggaaaagtgacaatgttccaagcgaggaagtcgtaaagaaaatgaagaactattggcggcagctcctaaatgcgaaactgataacgcaaagaaagttcgataacttaactaaagctgagaggggtggcttgtctgaacttgacaaggccggatttattaaacgtcagctcgtggaaacccgccaaatcacaaagcatgttgcacagatactagattcccgaatgaatacgaaatacgacgagaacgataagctgattcgggaagtcaaagtaatcactttaaagtcaaaattggtgtcggacttcagaaaggattttcaattctataaagttagggagataaataactaccaccatgcgcacgacgcttatcttaatgccgtcgtagggaccgcactcattaagaaatacccgaagctagaaagtgagtttgtgtatggtgattacaaagtttatgacgtccgtaagatgatcgcgaaaagcgaacaggagataggcaaggctacagccaaatacttcttttattctaacattatgaatttctttaagacggaaatcactctggcaaacggagagatacgcaaacgacctttaattgaaaccaatggggagacaggtgaaatcgtatgggataagggccgggacttcgcgacggtgagaaaagttttgtccatgccccaagtcaacatagtaaagaaaactgaggtgcagaccggagggttttcaaaggaatcgattcttccaaaaaggaatagtgataagctcatcgctcgtaaaaaggactgggacccgaaaaagtacggtggcttcgatagccctacagttgcctattctgtcctagtagtggcaaaagttgagaagggaaaatccaagaaactgaagtcagtcaaagaattattggggataacgattatggagcgctcgtcttttgaaaagaaccccatcgacttccttgaggcgaaaggttacaaggaagtaaaaaaggatctcataattaaactaccaaagtatagtctgtttgagttagaaaatggccgaaaacggatgttggctagcgccggagagcttcaaaaggggaacgaactcgcactaccgtctaaatacgtgaatttcctgtatttagcgtcccattacgagaagttgaaaggttcacctgaagataacgaacagaagcaactttttgttgagcagcacaaacattatctcgacgaaatcatagagcaaatttcggaattcagtaagagagtcatcctagctgatgccaatctggacaaagtattaagcgcatacaacaagcacagggataaacccatacgtgagcaggcggaaaatattatccatttgtttactcttaccaacctcggcgctccagccgcattcaagtattttgacacaacgatagatcgcaaacgatacacttctaccaaggaggtgctagacgcgacactgattcaccaatccatcacgggattatatgaaactcggatagatttgtcacagcttgggggtgac"""
    seqlist = chop_sequence(sequence)
    for index, a_seq in enumerate(seqlist):
        # desired_sequence = dw.SequenceString.from_record(rotated_record, topology='circular')
        seq = Seq(a_seq)
        a_seq_record = SeqRecord(
            seq,
            id=f"rotational_index {index}",
            name=f"Rotational Start number {index}",
            description=f"Sequence record for rotational Fragment {index}")

        desired_sequence = dw.SequenceString.from_record(a_seq_record, topology='circular')
        quote = assembly_station.get_quote(desired_sequence, with_assembly_plan=True)
        print(quote.assembly_step_summary())
        quote.compute_fragments_final_locations()
        report = quote.to_assembly_plan_report()
        report.write_full_report(f"report_{index}.zip")
        print(f"Done! (see report_{index}.zip)")
veghp commented 1 year ago

Glad to hear that changing the size parameter helped solving the issue -- I can imagine that a larger size range allows DNA Weaver to choose from more options, but also that a longer sequence will tend towards a Tm average, one that will be within the Tm range.

Not sure what considerations were behind the original size parameter values, but perhaps these can be further modified, based on https://github.com/Edinburgh-Genome-Foundry/DnaWeaver/issues/14#issuecomment-1697230729 .

Zulko commented 11 months ago

I realize now that this project has a very little documentation on how it works internally , it would be worth adding a link to this (very long!) presentation in the README: https://github.com/Edinburgh-Genome-Foundry/egf-shared-documents/blob/master/slideshows/dnaweaver_presentation_iwbda_2019/talk_long.pdf

veghp commented 11 months ago

That's a good idea and an excellent presentation - feel free to edit the readme. Alternatively I can add a few comments later this week, perhaps at the end of the 'How it works' section.