Closed bwvogler closed 1 year ago
Thank you, yes I have started suspecting recently that there may be an issue when upper/lowercase is mixed in the sequence.
I'll have a look.
I had a look and found that in certain cases where the sequence contains both uppercase and lowercase we get an error, and that the above solution fixes it.
The problem, for the record: when the part is all uppercase or all lowercase, it works fine. If the part is mixed case and the lowercase is outside the insert region, it also works fine (except for the receptor). In all other mixed-case situations (whether it's uppercase in the overhang, or inside the fragment), we get an error.
(Note: files saved in SnapGene as 'Genbank standard' are automatically converted to lowercase. Biopython and DNA Cauldron import GenBank sequences as uppercase, and export simulated constructs in lowercase.)
In StickyEndFragment.list_from_record_digestion (line 142), add an upper() conversion, changing from the existing
index = record.seq.find(fragment)
to the fixed
index = record.seq.upper().find(fragment)
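To illustrate why the one-line change above matters, here is a minimal sketch using plain Python strings as a stand-in for a Biopython Seq (whose find() is likewise case-sensitive). The sequences are made up for the example, and it assumes the fragment being searched for is uppercase, as the fix implies.

```python
# Plain str stands in for Bio.Seq here; both use a case-sensitive find().
# Hypothetical mixed-case source record (not taken from a real part).
record_seq = "aaggtctcatgcaTTGACCgcttagagaccaa"

# Hypothetical fragment, assumed to be uppercase when it is searched for.
fragment = "TGCATTGACCGCTT"

# Existing behaviour: the case-sensitive search misses the fragment.
index_before = record_seq.find(fragment)

# With the upper() conversion, the fragment is located.
index_after = record_seq.upper().find(fragment)

print(index_before)  # -1
print(index_after)   # 9
```

A return value of -1 means "not found", so any downstream code that treats the index as a valid position would go wrong for such records.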
The current implementation is broken for source records whose Seq is lowercase. If there is a requirement somewhere that these sequences be uppercase, then add a useful error report here instead.
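If the uppercase requirement route were preferred, the check could look something like the sketch below. The helper name require_uppercase and its record_id parameter are hypothetical and not part of DNA Cauldron; the function only shows one way to fail early with a descriptive message.

```python
def require_uppercase(sequence, record_id="<unknown>"):
    """Return the sequence as a str, or raise if it contains lowercase.

    Hypothetical helper, not part of DNA Cauldron or Biopython.
    """
    text = str(sequence)
    if text != text.upper():
        # Locate the first offending character for a more useful message.
        first_lower = next(i for i, c in enumerate(text) if c.islower())
        raise ValueError(
            f"Record {record_id!r} contains lowercase sequence "
            f"(first at position {first_lower}); "
            "convert it to uppercase before digestion."
        )
    return text


print(require_uppercase("ATGC", record_id="part_A"))  # ATGC
```

Calling it on a mixed-case sequence such as "ATgC" raises a ValueError naming the record and the position of the first lowercase base.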