EddyRivasLab / hmmer

HMMER: biological sequence analysis using profile HMMs
http://hmmer.org

to (ali coord) and to (env coord) larger than qlen in domtableout #288

Closed ilyavs closed 1 year ago

ilyavs commented 1 year ago

Hello, I am looking at the jackhmmer results of a local analysis and see cases where to (ali coord) and to (env coord) are larger than qlen. The documentation states:

to (ali coord): The end of the MEA alignment of this domain with respect to the sequence, numbered 1..L for a sequence of L residues.

to (env coord): The end of the domain envelope on the sequence, numbered 1..L for a sequence of L residues.

So I am confused. How can the end be larger than the number of residues in the query? How do I get the coordinates of the domain in the query sequence? Thanks, Ilya.

cryptogenomicon commented 1 year ago

The query is the profile HMM that's been built from your input sequence, and its coords for each hit are in the two hmm coord columns, from and to. You're looking at the coordinates for the target sequence that the profile HMM has been aligned to.

ilyavs commented 1 year ago

Oh, now I understand. Thank you very much! Is it the same in hmmscan, or the other way around? It seems that in hmmscan the hmm coord columns are for the target profiles, right? In that case, is the env coord in the protein query coordinates?

cryptogenomicon commented 1 year ago

Yes. The HMM coords are always for the profile HMM, and the ali and env coords are always for the protein sequence. In hmmscan, the query is a protein sequence and the targets are profile HMMs.
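A minimal Python sketch of the rule above, assuming the usual whitespace-delimited domain table layout (tlen in the 3rd column, qlen in the 6th, then hmm, ali and env from/to in columns 16 to 21). The names `DomainHit`, `parse_domtblout_line`, `sequence_length` and `profile_length` are hypothetical helpers, and the example line is invented for illustration, not real output.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    target_name: str
    tlen: int
    query_name: str
    qlen: int
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int


def parse_domtblout_line(line: str) -> DomainHit:
    """Parse one non-comment line of a domain table (assumed column layout)."""
    # The final column is a free-text description, so cap the split at 22.
    f = line.split(maxsplit=22)
    return DomainHit(
        target_name=f[0], tlen=int(f[2]),
        query_name=f[3], qlen=int(f[5]),
        hmm_from=int(f[15]), hmm_to=int(f[16]),
        ali_from=int(f[17]), ali_to=int(f[18]),
        env_from=int(f[19]), env_to=int(f[20]),
    )


def sequence_length(hit: DomainHit, program: str) -> int:
    """Length of the protein sequence that the ali/env coords index into."""
    # hmmscan: the query is the sequence. jackhmmer: the target is the sequence.
    return hit.qlen if program == "hmmscan" else hit.tlen


def profile_length(hit: DomainHit, program: str) -> int:
    """Length of the profile HMM that the hmm coords index into."""
    return hit.tlen if program == "hmmscan" else hit.qlen


# Invented jackhmmer-style line: a 120-position query profile hitting a
# 350-residue target sequence, so env/ali "to" can exceed qlen.
EXAMPLE_LINE = " ".join([
    "targetSeq1", "-", "350", "queryProt", "-", "120",
    "1e-30", "100.0", "0.1", "1", "1", "1e-32", "1e-30", "99.5", "0.1",
    "1", "118", "210", "330", "205", "335", "0.95",
    "made-up description text",
])

hit = parse_domtblout_line(EXAMPLE_LINE)
print(hit.env_to, hit.qlen, sequence_length(hit, "jackhmmer"))
```

In this invented jackhmmer example, env to (335) is larger than qlen (120) but still within tlen (350), which is the situation described in the original question.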

ilyavs commented 1 year ago

Got it. Thank you. Just making sure: if I want to calculate target coverage for hmmscan, I use the hmm coord and tlen. For jackhmmer, I can use env coord and tlen. Is that correct?

cryptogenomicon commented 1 year ago

Off the top of my head yes, but it's best to check the documentation on this; that's why it's there.
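As a rough sketch of the coverage calculation discussed here, the following Python function computes the fraction of a target covered by one or more 1-based, inclusive coordinate ranges. The function name `covered_fraction` is hypothetical, and merging overlapping domains is an assumption about what "coverage" should mean when a target has several hits; the numbers in the usage lines are invented.

```python
from typing import Iterable, Tuple


def covered_fraction(intervals: Iterable[Tuple[int, int]], length: int) -> float:
    """Fraction of positions 1..length covered by 1-based inclusive intervals.

    Overlapping intervals are counted once (an assumption of this sketch).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    covered = 0
    last_counted = 0  # highest position already counted
    for start, end in sorted(intervals):
        if not 1 <= start <= end <= length:
            raise ValueError(f"interval ({start}, {end}) is outside 1..{length}")
        # Skip the part of this interval that an earlier one already covered.
        start = max(start, last_counted + 1)
        if start <= end:
            covered += end - start + 1
            last_counted = end
    return covered / length


# hmmscan: the target is a profile, so pass (hmm from, hmm to) with tlen.
profile_cov = covered_fraction([(1, 118)], 120)

# jackhmmer: the target is a sequence, so pass (env from, env to) with tlen.
sequence_cov = covered_fraction([(205, 335)], 350)

print(round(profile_cov, 3), round(sequence_cov, 3))
```

The range check raises an error if coordinates are paired with the wrong length column, for example env coords from a jackhmmer run checked against qlen.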

ilyavs commented 1 year ago

Of course I checked the documentation before even posting this issue. That's why I asked for clarification. Thanks for confirming.