EddyRivasLab / hmmer

HMMER: biological sequence analysis using profile HMMs
http://hmmer.org

Alignment input parse error #147

Closed: csijcs closed this issue 6 years ago

csijcs commented 6 years ago

I'm having trouble running hmmer on a fasta of amino acid sequences. I am running:

hmmbuild -f isoformSwitchAnalyzeR_isoform_AA.hmm isoformSwitchAnalyzeR_isoform_AA.fasta

And I get the following error:

Alignment input parse error:
   sequence ENST00000000412.7 has alen 277; expected 180
   while reading aligned FASTA file isoformSwitchAnalyzeR_isoform_AA.fasta
   at or near line 10

Can you please tell me what I'm doing wrong? I have run small files on the website before, but this file has over 50,000 sequences, so too many to do on the site.

Thanks, Joe

cryptogenomicon commented 6 years ago

It means there is a problem with the format of your aligned FASTA file at line 10; the sequence "ENST00000000412.7", which is 277 symbols long, doesn't have the same length as the sequence(s) that came before it, which were 180 symbols long. Aligned sequences have to have the same length in symbols (residues + gaps). Perhaps your file is just plain unaligned FASTA. The input to hmmbuild is an alignment file. For further information, see the User's Guide.

csijcs commented 6 years ago

Thanks for your reply. I've consulted the manual and it's just not very straightforward. My favorite section is on how to not have to read the manual. Maybe you can help me. I have this fasta output from a program that I have used on the hmmer site before without issue. I'm not clear on how exactly to use the standalone program to get the pfam analysis from this fasta. I thought I needed to build the hmm profile first, but apparently I can't do that with this unaligned fasta. So do I need to align the fasta? To what? Using tools from the hmmer program? Do I need to convert to Stockholm format? What is the process that is used by the program online? It's not clear to me in the manual because it seems to start with an aligned Stockholm file and I don't know how to get that. I have no experience with this and am trying to figure out how to use it. Any help would be much appreciated.

cryptogenomicon commented 6 years ago

You haven't explained what you're trying to do - and even if you did it's not really practical for me to give individual help on how to do your analysis, unfortunately, though I wish there were time for that. Better to chat with someone around you who's done this sort of thing before.

shiting878 commented 5 years ago

Did you solve your problem? I ran into the same trouble. Could you help me, please?

cryptogenomicon commented 5 years ago

hmmbuild requires an alignment file:

   hmmbuild <output_hmmfile> <input_alignmentfile>

If the alignment file is in "FASTA" format it has to be aligned FASTA format (with gaps included; each sequence the same length).
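
To check whether a FASTA file meets the "each sequence the same length" requirement before handing it to hmmbuild, something like the following Python sketch may help. It is not part of HMMER and does not reproduce hmmbuild's parser; the function names `fasta_lengths` and `first_length_mismatch` are made up for this illustration, and it assumes a simple FASTA layout (a `>` header line followed by one or more sequence lines, with gap characters counted as symbols).

```python
def fasta_lengths(lines):
    """Return [(name, length), ...] for each record in an iterable of FASTA lines.

    Length counts every non-whitespace symbol on the sequence lines,
    so residues and gap characters are both included.
    (Hypothetical helper for illustration; not HMMER code.)
    """
    records = []
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                records.append((name, length))
            parts = line[1:].split()
            name = parts[0] if parts else ""  # first word of the header
            length = 0
        elif name is not None:
            length += len("".join(line.split()))
    if name is not None:
        records.append((name, length))
    return records


def first_length_mismatch(records):
    """Return (name, length, expected) for the first record whose length
    differs from the first record's length, or None if all lengths agree."""
    if not records:
        return None
    expected = records[0][1]
    for name, length in records[1:]:
        if length != expected:
            return (name, length, expected)
    return None
```

To use it on the file from this thread, open `isoformSwitchAnalyzeR_isoform_AA.fasta`, pass the file handle to `fasta_lengths`, and pass the result to `first_length_mismatch`. A non-`None` result means the sequences differ in length, so the file is most likely plain unaligned FASTA rather than an alignment. Equal lengths are necessary for aligned FASTA but do not by themselves prove the sequences were actually aligned.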