Following the call chain BioBrickStandardAssembly -> StickyEndFragment.list_from_record_digestion() -> StickyEndSeq.list_from_sequence_digestion() -> overhang_bit = fragments[0][:overhang]
we see that slicing under Biopython v1.79 returns incorrect sequences, with the sequence rendered as a bytestring flanked with b' and '. This problem occurs only when 2 enzymes are used (also relevant for https://github.com/Edinburgh-Genome-Foundry/DnaCauldron/issues/15).
Reproducible example:
from dnacauldron.Fragment.StickyEndFragment.StickyEndSeq import StickyEndSeq
f = StickyEndSeq('CTAGTAAAAAAAAAAA')
f[:4]
# (None-b'CTAG'-None)
The best solution seems to be to implement a StickyEndSeq.slice(from, to) method that slices the to_standard_sequence(discard_sticky_ends=True) export and creates a new StickyEndSeq from the result.
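A minimal sketch of that idea, using a hypothetical stand-in class (MiniStickyEndSeq) rather than the real DnaCauldron StickyEndSeq. The constructor arguments, the simplified to_standard_sequence() and the choice to drop sticky ends on the slice are all assumptions made for illustration. Since `from` is a reserved word in Python, the parameters are named start and end here.

```python
class MiniStickyEndSeq:
    """Hypothetical stand-in for StickyEndSeq; not the DnaCauldron class."""

    def __init__(self, data, left_end=None, right_end=None):
        # Accept str, bytes or bytearray and always store a plain str,
        # so that no b'...' representation can leak into the sequence.
        if isinstance(data, (bytes, bytearray)):
            data = bytes(data).decode("ascii")
        self.data = str(data)
        self.left_end = left_end
        self.right_end = right_end

    def to_standard_sequence(self, discard_sticky_ends=True):
        # Simplified assumption: the real method may treat ends differently.
        if discard_sticky_ends:
            return self.data
        return (self.left_end or "") + self.data + (self.right_end or "")

    def slice(self, start, end):
        # Slice the plain-text export, then wrap the result in a new object.
        core = self.to_standard_sequence(discard_sticky_ends=True)
        return MiniStickyEndSeq(core[start:end])

    def __repr__(self):
        return "(%s-%s-%s)" % (self.left_end, self.data, self.right_end)


fragment = MiniStickyEndSeq(b"CTAGTAAAAAAAAAAA")
overhang_bit = fragment.slice(0, 4)
print(overhang_bit)
```

With this approach the recursive digestion step would call fragments[0].slice(0, overhang) instead of fragments[0][:overhang]; whether the sliced object should inherit either sticky end is left open.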
To clarify the above: the bug appears only with 2 or more enzymes, because list_from_sequence_digestion() then recurses and passes a StickyEndFragment (rather than a Seq) to itself.
Test test_hierarchical_biobrick fails with Biopython v1.79. In v1.79, "The Seq and MutableSeq classes in Bio.Seq now store their sequence contents as bytes and bytearray objects, respectively." (https://github.com/biopython/biopython/blob/master/NEWS.rst#3-june-2021-biopython-179)
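For illustration, the snippet below shows standard Python behaviour that would produce the b'...' flanking: calling str() on a bytes object yields its literal representation, not the decoded text. Whether StickyEndSeq actually takes this path internally is an assumption; the variable names are made up for the example.

```python
raw = b"CTAG"  # what a bytes-backed sequence might hold internally

as_repr = str(raw)             # literal representation, quotes and prefix included
as_text = raw.decode("ascii")  # the actual sequence text

print(as_repr)  # b'CTAG'
print(as_text)  # CTAG
```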