EddyRivasLab / hmmer

HMMER: biological sequence analysis using profile HMMs
http://hmmer.org

hmmsearch force results for all query sequences #247

Closed gillichu closed 3 years ago

gillichu commented 3 years ago

I'm trying to get hmmsearch to report bit scores for every one of my query sequences. I'm working with Pfam protein datasets. I've used the '--max' flag, but even with the MSV, Viterbi, and bias composition filters turned off, some sequences are still filtered out in the final domain-filtering step. Any advice on how to make bit scores for the sequence names listed in missing_seq.txt appear in the hmmsearch output (searchout.txt) would be very helpful. I've attached files to reproduce the missing sequences, along with the commands and versions I used.

Versions:

My commands are:

hmmbuild --amino --cpu 5 PF00207.24.myhmm train_seq.txt
hmmsearch --noali --cpu 5 -o searchout.txt --max PF00207.24.myhmm test_queries.txt
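To check which sequences are absent from the results, a minimal Python sketch along these lines may help. It assumes that test_queries.txt is in FASTA format and that the search is rerun with hmmsearch's `--tblout` option to get a whitespace-delimited table; the file name `hits.tbl` and all function names here are hypothetical. It only reports which sequences have no hit; it does not make hmmsearch score them.

```python
import sys


def fasta_names(lines):
    """Return sequence names (first word after '>') from FASTA lines."""
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def tblout_scores(lines):
    """Map target name -> full-sequence bit score from --tblout lines.

    Assumes the standard per-target table layout, where column 1 is the
    target name and column 6 is the full-sequence bit score.
    """
    scores = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete hit line
        name, score = fields[0], float(fields[5])
        # Keep the best score if a name appears more than once.
        if name not in scores or score > scores[name]:
            scores[name] = score
    return scores


def missing_names(fasta_lines, tblout_lines):
    """Return FASTA sequence names that have no line in the hit table."""
    scored = tblout_scores(tblout_lines)
    return [name for name in fasta_names(fasta_lines) if name not in scored]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python find_missing.py test_queries.txt hits.tbl
    with open(sys.argv[1]) as fasta, open(sys.argv[2]) as table:
        for name in missing_names(fasta, table):
            print(name)
```

The script prints one missing sequence name per line, which can be compared against missing_seq.txt.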

Contents of example.zip:

cryptogenomicon commented 3 years ago

I'm sorry, there is no way to do this in hmmsearch, by design. It is designed to identify the most significant hits, not to score all sequences.

gillichu commented 3 years ago

I see, thanks for your response!