Closed fbosnic closed 3 years ago
Hi @fbosnic ,
The issue you're having is indeed coming from that particular line, and since I was not testing nhmmer
until now I didn't realize there was an bug there. However, after writing more extensive test cases, I also noticed that I couldn't get pyhmmer.hmmer.nhmmer
to get the same hits as the nhmmer
binary. I implemented the pyhmmer.hmmer.nhmmer
function thinking nhmmer
was just doing the same thing as phmmer
with the DNA or RNA alphabet instead of the Amino alphabet, but since it may not be the case, I'll try to look into this issue a bit further and look if I need to rewrite the current implementation.
Hi @althonos,
I see, thanks for the answer. I would be very interested in using the function once this is sorted out.
Hi @fbosnic ,
The difference between nhmmer
and phmmer
is that nhmmer
will use the "long target pipeline" instead of the default pipeline used by everything else. This allows pipelined searches to run on target sequences longer than 100,000 residues using a sliding window to process each target.
I've released v0.4.9
, which has a fixed implementation for nhmmer
. The function now accepts HMM
arguments in addition to DigitalSequence
and DigitalMSA
arguments, and will run a long target search pipeline. I have added tests to make sure the results are consistent between HMMER and PyHMMER on the same query HMM and target genome.
I'm trying to run nhmmer function but can't seem to get the alphabet right. Running
results in
Am I using it correctly, does the alphabet need to be set somewhere else as well?
As far as I could trace it, the following line might be the cause https://github.com/althonos/pyhmmer/blob/eb6fe7c0e74557e0ae9d647693711583d2d86b68/pyhmmer/hmmer.py#L82