ptrebert closed this issue 2 years ago.
Can you send a reproducible test case w/ files and command line, please?
Confidentially, yes - can I share that via Globus?
Preferable to create a small and non-confidential test case, if possible.
Ok, I truncated both sequences, but the error is still being triggered:
$ nhmmer --cpu 1 -o text.out --tblout table.out -E 1.60E-150 --dna query.fasta target-10k.fasta
Parse failed (sequence file target-10k.fasta):
Line 2: unexpected char A; expected FASTA to start with >
Sequence composition of the target is A 3332, C 5002, G 0, T 1666.
Attachment: testcase.tar.gz
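For reference, a composition like the one above can be tallied with a few lines of Python. This is only a minimal sketch: the function name `base_composition` and the in-memory example record are made up for illustration and are not part of HMMER or Easel.

```python
from collections import Counter


def base_composition(fasta_text: str) -> Counter:
    """Count residues in FASTA-formatted text, skipping header lines."""
    counts = Counter()
    for line in fasta_text.splitlines():
        line = line.strip()
        # Skip blank lines and '>' header lines; everything else is sequence.
        if not line or line.startswith(">"):
            continue
        counts.update(line.upper())
    return counts


# Hypothetical toy record; replace with the contents of the real target file.
example = ">target\nACCTAC\nCCAT\n"
comp = base_composition(example)
print({base: comp[base] for base in "ACGT"})

# Any key outside ACGT would indicate an unexpected character in the file.
unexpected = {base: n for base, n in comp.items() if base not in "ACGT"}
print("non-ACGT characters:", unexpected)
```

To check a real file, pass `open("target-10k.fasta").read()` to the function instead of the toy string. A count of 0 for any of the four bases is worth noting when reporting the problem.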
@cryptogenomicon I have another occurrence of the same error in a different sample, in case the test data above are not sufficient to diagnose and fix the problem.
Hi @cryptogenomicon, can you estimate when a fix for this issue will be available in develop?
I've just noticed this issue thread, and that the problem is showing up in nhmmer. I can reproduce the error. After a bit of exploration, it looks like the error disappears if a single 'G' is added anywhere in the first half (or so) of the target sequence. That's not the expected behavior (of course). I also confirm that phmmer on the same input does not produce an error.
I can take a deeper look at this tomorrow, unless someone else is already knee-deep in the problem.
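For anyone who wants to explore this without the attached files, a synthetic G-free target can be generated. This is a rough sketch under the assumption that the absence of G is what matters, as the observation above suggests. The function name, seed, and record name are hypothetical, the base weights simply mirror the composition reported earlier in the thread, and it has not been verified that such a file triggers the same error.

```python
import random


def make_g_free_fasta(length: int = 10_000, seed: int = 0,
                      name: str = "g_free_target") -> str:
    """Return a single-line FASTA record containing only A, C and T."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    # Weights follow the reported composition: A 3332, C 5002, T 1666, no G.
    seq = "".join(rng.choices("ACT", weights=[3332, 5002, 1666], k=length))
    return f">{name}\n{seq}\n"


record = make_g_free_fasta()
header, sequence = record.splitlines()
print(header, len(sequence), "contains G:", "G" in sequence)
```

Writing `record` to a file and using it as the target in the nhmmer command quoted earlier would be one way to test whether a missing base alone is enough to reproduce the parse failure.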
Thanks @traviswheeler for taking care of this!
A fix for this issue has been merged into the develop branch.
Thanks a lot!
Hi, I am trying to run HMMER on a pair of query/target FASTA files. Both files were confirmed to contain no characters other than ACGT, and both are formatted as single-line FASTA files. The query file has a single entry, the target several hundred.
Now, I started by running HMMER v3.3.2 installed via Conda, and encountered the invalid alphabet error:
I found PR #252 and built HMMER from source (latest commit to develop #8ab8e8b ; EASEL was included from develop as described in the README). Now, when I rerun the above data with the --dna switch, I get the following error:
The assembly.fasta file starts as follows (line numbers included for readability):
I assume a trivial formatting issue is causing this, but the error message is quite confusing. Thanks for your help.
Best, Peter