Closed olgabot closed 4 years ago
Old way:
In [13]: timeit.timeit(
...: 'TranslateSingleSeq.three_frame_translation(Seq("CGCTTGCTTAATACTGACATCAATAATATTAGGAAAATCGCAATATAACTGTAAATCCTGTTCTGTC"))',
...: setup='from Bio.Seq import Seq\nfrom sencha.translate_single_seq import TranslateSingleSeq',
...: number=int(1e6))
Out[13]: 0.6314636569999834
New way:
from sencha.constants_translate import STANDARD_CODON_TABLE_MAPPING
In [18]: timeit.timeit(
...: 'TranslateSingleSeq.three_frame_translation(Seq("CGCTTGCTTAATACTGACATCAATAATATTAGGAAAATCGCAATATAACTGTAAATCCTGTTCTGTC"))',
...: setup='from Bio.Seq import Seq\nfrom sencha.translate_single_seq import TranslateSingleSeq',
...: number=int(1e6))
Out[18]: 0.5854893400000094
Hmm, only 8% faster??
In [19]: 0.5854893400000094/0.6314636569999834
Out[19]: 0.9271940411925002
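
For reference, that ratio can be read two ways, as a reduction in total runtime or as a gain in speed. A quick sketch of the arithmetic using the two `timeit` totals reported above (each is the total in seconds for 1e6 calls; the variable names are just for illustration):

```python
# Totals reported by timeit.timeit(..., number=int(1e6)), in seconds
old_seconds = 0.6314636569999834
new_seconds = 0.5854893400000094
number = int(1e6)

ratio = new_seconds / old_seconds                     # new runtime as a fraction of old
time_saved_pct = (1 - ratio) * 100                    # percent less wall-clock time
speedup_pct = (old_seconds / new_seconds - 1) * 100   # percent more calls per second

# Average cost of a single call, in microseconds
old_us_per_call = old_seconds / number * 1e6
new_us_per_call = new_seconds / number * 1e6

print(f"runtime reduced by {time_saved_pct:.1f}%")
print(f"throughput increased by {speedup_pct:.1f}%")
print(f"per call: {old_us_per_call:.3f} us -> {new_us_per_call:.3f} us")
```

So the runtime drops by a bit over 7%, which is the same thing as being just under 8% faster.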
Now the reads just stay as pure Python strings! No Biopython backend necessary. The translation happens in `translate_single_seq.py` using the `STANDARD_CODON_TABLE` specified in `constants_translate.py`.
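
To illustrate the general idea of string-only translation, here is a minimal sketch. It is not the code in `translate_single_seq.py`: the names `CODON_TABLE`, `translate`, and `three_frame_translation` below are hypothetical stand-ins, and mapping unrecognized codons to `X` and dropping trailing partial codons are assumptions made for the example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built from the usual
# compact form: codons enumerated over T, C, A, G with the first base
# varying slowest, paired with the matching amino-acid string.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a DNA string codon by codon using plain dict lookups."""
    seq = seq.upper()
    end = len(seq) - len(seq) % 3  # assumption: ignore a trailing partial codon
    # Assumption: codons not in the table (e.g. containing N) become "X"
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, end, 3))


def three_frame_translation(seq):
    """Return the translations of the three forward reading frames."""
    return [translate(seq[frame:]) for frame in range(3)]


print(three_frame_translation("ATGGCCTAA"))
```

Because the input and output are both ordinary `str` objects, nothing has to be wrapped in or unwrapped from a `Seq` object along the way.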
This PR removes the biopython dependency because a lot of time is spent converting between Python strings and Biopython `Seq` objects and back, which makes `sencha translate` take forever.

PR checklist

- `pytest` (or `make coverage` if you want to see which lines don't have tests yet)
- `black . --check`
- `usage.md` is updated
- `README.md` is updated