EddyRivasLab / easel

Sequence analysis library used by Eddy/Rivas lab code
Other
46 stars 26 forks source link

sqncbi_ReadWindow misses first sequence after being rewound #6

Closed traviswheeler closed 7 years ago

traviswheeler commented 8 years ago

To reproduce with hmmsearch (h3-develop):

% cat tutorial/globins4.hmm tutorial/globins4.hmm > twomodels.hmm
% makeblastdb  -in tutorial/globins45.fa  -dbtype prot -parse_seqids -out globins
% src/hmmsearch  --tformat ncbi  twomodels.hmm globins | grep Target
Target sequences:                           45  (6519 residues searched)
Target sequences:                           44  (6366 residues searched)

For all query models but the first, the first sequence in the target database gets skipped.

I've tracked it down to the line index = ncbi->index + 1; in sqncbi_ReadWindow(). The first time through that function (for the first query model), at line 1060, ncbi->index == -1 In subsequent iterations ncbi->index == 0 The result is that all queries after the first start against sequence 0+1=1, meaning that the first sequence is skipped.

This appears to be the result of the call that's supposed to reset (rewind) the target database: esl_sqfile_Position(dbfp, 0); which is called for every query model after the first in hmmsearch's serial_master(). I do not have a solution.