Closed chtsai0105 closed 1 year ago
Hi @chtsai0105, thank you for the bug report. I'm also getting a segfault using the test snippet you provided, I'll try to understand what's going on.
The DigitalSequenceBlock.translate
method was occasionally writing to the wrong buffer, namely the sequence of the last protein of the block. Depending to how you'd slice, you could end up writing out of the buffer bounds, and get a segfault later! That's why you were seeing different behaviour based on the sequence coordinates.
I have patched the bug, updated tests to pick up any regression, and will make a new release with the patch.
Thank you for such a quick update! Have confirmed that it behave as expected.
Hi - I was trying to load a dna coding sequence and want to further translate to peptide sequence for hmmsearch. I loaded the gzipped fasta dna sequence with
pyhmmer.easel.SequenceFile
and everything is fine so far, the Alphabet correctly identify it as dna. However when I dotranslate
my jupyter kernel keep crashing. I also tried it with regular script but still gotAborted (core dumped)
failure.I did a quick test to find out what going on. I found that when I can successfully translate the
SequenceBlock
with the first 5 entries. I can also successfully translate the block from 6th to 20th entries. But when I do it with 0 to 6, it crashed. I also checked the fasta file but cannot found anything that is suspicious.And this failure seems to be vary between machines. I did the test on another host and it success for the 0 to 6 block. But when I tried to translate the entire block it will eventually failed.
Not sure whether there is anything that I didn't noticed in my fasta so I included the fasta below. Actinomucor_elegans_CBS_100.09.cds.fasta.gz
And here is my test script - it usually failed at the last line but in would failed in
seqs[0:6].translate()
when running on jupyter. I've already updated the pyhmmer to the latest version 0.10.1.