Edinburgh-Genome-Foundry / DnaCauldron

:alembic: Simple cloning simulator (Golden Gate etc.) for single and combinatorial assemblies
https://edinburgh-genome-foundry.github.io/DnaCauldron/
MIT License

Test fails with Biopython v1.79 #16

Closed (veghp closed 2 years ago)

veghp commented 2 years ago

Test test_hierarchical_biobrick fails with Biopython v1.79.

In v1.79 "The Seq and MutableSeq classes in Bio.Seq now store their sequence contents as bytes and bytearray objects, respectively." ( https://github.com/biopython/biopython/blob/master/NEWS.rst#3-june-2021-biopython-179 )

Tracing the call chain BioBrickStandardAssembly -> StickyEndFragment.list_from_record_digestion() -> StickyEndSeq.list_from_sequence_digestion() -> overhang_bit = fragments[0][:overhang], we see that this change breaks slicing: the slice returns an incorrect sequence in which the bytestring is flanked with b' and '. This problem occurs only when 2 enzymes are used (also relevant for https://github.com/Edinburgh-Genome-Foundry/DnaCauldron/issues/15).

Reproducible example:

from dnacauldron.Fragment.StickyEndFragment.StickyEndSeq import StickyEndSeq
f = StickyEndSeq('CTAGTAAAAAAAAAAA')
f[:4]
# (None-b'CTAG'-None)
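
One mechanism consistent with the output above is that slicing hands the raw bytes back to the subclass constructor, which then converts them with str(). The sketch below illustrates this with hypothetical stand-in classes (ToySeq, ToyStickyEndSeq); it is not the actual Biopython or DnaCauldron code, and the str() coercion in the constructor is an assumption.

```python
class ToySeq:
    """Hypothetical stand-in for a sequence class that stores bytes."""

    def __init__(self, data):
        if isinstance(data, str):
            data = data.encode("ascii")
        self._data = bytes(data)

    def __getitem__(self, index):
        # Slicing re-enters the (sub)class constructor with raw bytes.
        return self.__class__(self._data[index])

    def __str__(self):
        return self._data.decode("ascii")


class ToyStickyEndSeq(ToySeq):
    """Hypothetical subclass; assumes the constructor coerces with str()."""

    def __init__(self, data, left_end=None, right_end=None):
        # str(b"CTAG") is the text "b'CTAG'", not "CTAG".
        super().__init__(str(data))
        self.left_end = left_end
        self.right_end = right_end

    def __repr__(self):
        return "(%s-%s-%s)" % (self.left_end, str(self), self.right_end)


f = ToyStickyEndSeq("CTAGTAAAAAAAAAAA")
print(repr(f[:4]))          # the b'...' wrapper leaks into the sequence
print(b"CTAG".decode())     # decoding instead of str() gives plain text
```

If this is indeed the cause, decoding bytes input before storing it would be an alternative to a dedicated slice method.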

The best solution seems to be to implement a StickyEndSeq.slice(from, to) method that slices the to_standard_sequence(discard_sticky_ends=True) export and creates a new StickyEndSeq from the result.
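
A minimal sketch of what such a method could look like, using a hypothetical stand-in class (SketchStickyEndSeq) rather than the real StickyEndSeq. The parameters are named start and end here because from is a reserved word in Python. Returning a slice with no sticky ends, and storing ends as plain strings, are assumptions made for illustration.

```python
class SketchStickyEndSeq:
    """Hypothetical stand-in; not the real DnaCauldron class."""

    def __init__(self, sequence, left_end=None, right_end=None):
        self.sequence = str(sequence)
        self.left_end = left_end
        self.right_end = right_end

    def to_standard_sequence(self, discard_sticky_ends=False):
        # Assumption: sticky ends are plain strings in this sketch.
        if discard_sticky_ends:
            return self.sequence
        return (self.left_end or "") + self.sequence + (self.right_end or "")

    def slice(self, start, end):
        # Slice the plain-text export, then build a fresh object from it,
        # so no bytes object ever reaches the constructor.
        plain = self.to_standard_sequence(discard_sticky_ends=True)
        return SketchStickyEndSeq(plain[start:end])


s = SketchStickyEndSeq("CTAGTAAAAAAAAAAA", left_end="AATT")
overhang_bit = s.slice(0, 4)
print(overhang_bit.sequence)
```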

veghp commented 2 years ago

To clarify the above: the bug appears only with 2 or more enzymes, because list_from_sequence_digestion() then recurses and passes a StickyEndFragment (rather than a Seq) to itself.