Closed ViriatoII closed 5 years ago
Can you attach the allRepeats.fasta file so I can try to reproduce the error? If that file is too big, or you don't want to share it, can you try to reproduce the bug with a smaller file and attach that? Thanks.
It might be sufficient to just send us the sequence rnd-6_family-6587#Unknown from your allRepeats.fasta file, verbatim (i.e. exactly as it appears in the file). The problem might be with your FASTA format.
Hey, thank you for the attention. Here is a sample of my input. It is a single-line FASTA entry. error_repeat.fa.txt (.txt suffix only for uploading reasons)
Works fine for me. Can you verify that your problem is reproducible on that smaller example?
esl-sfetch --index error_repeat.fa
esl-sfetch -c 60..1015 error_repeat.fa rnd-6_family-6587#Unknown
esl-sfetch -c 61..1015 error_repeat.fa rnd-6_family-6587#Unknown
Sorry for taking some time. esl-sfetch works perfectly on my small sample, but the error still happens with the big FASTA file.
I think I found the origin of the error. The original file is a concatenation of several FASTA files. Some of them are multi-line and others are single-line. I hadn't noticed this.
Your algorithm probably decides that the file is multi-line based on the first entry and then fails on single-line entries.
The solution is streamlining the file beforehand. I don't know if this can be considered a bug then.
Kind regards, Ricardo
No, the program works fine on a mix of multiline and single line FASTA entries, so it should not fail because of that.
Based on what you said about concatenation, I'm guessing that your sequence file was changed (by concatenating new files onto it) without re-indexing it (with esl-sfetch --index), so the index file was mismatched with the FASTA file. There isn't a good way of testing for the index file being up to date with the sequence file without adding extra machinery that doesn't seem worth the effort. In this case, another solution is to make a new index for the file.
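For anyone hitting the same thing, here is a rough sketch of a wrapper that rebuilds the index when it looks out of date. It is only a heuristic, not a real consistency check: it compares file modification times, and it assumes the index sits next to the FASTA file as `<file>.ssi`. The function names `index_is_stale` and `reindex_if_stale` are made up for this example and are not part of Easel. It also relies on the `-nt` file test, which bash, dash and ksh provide.

```shell
# Hypothetical helper: succeeds (exit 0) if the index for FASTA file $1
# is missing or older than the FASTA file itself.
# Assumption: the index lives next to the FASTA file as "$1.ssi".
index_is_stale() {
    [ ! -e "$1.ssi" ] || [ "$1" -nt "$1.ssi" ]
}

# Hypothetical helper: rebuild the index only when it looks stale.
reindex_if_stale() {
    if index_is_stale "$1"; then
        # Remove any old index first, in case esl-sfetch does not
        # overwrite an existing index file.
        rm -f "$1.ssi"
        esl-sfetch --index "$1"
    fi
}
```

Calling `reindex_if_stale allRepeats.fasta` before the `esl-sfetch -c ...` command would then re-index after a concatenation. A timestamp comparison can be fooled (for example by copying files around), so deleting the index and re-running `esl-sfetch --index` by hand remains the safe option.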
Hi! (Reposting a question I asked on Biostars.) I am using a utility that comes with HMMER, esl-sfetch, which retrieves part of a FASTA sequence given coordinates (denoted by -c FROM..TO) and a sequence name, like this:
esl-sfetch -c FROM..TO input.fasta sequence name
Now, I'm finding it impossible to use FROM coordinates over 60. Below you can see a successful example followed by an unsuccessful one:
$ /home/src/hmmer-3.2.1/easel/miniapps/esl-sfetch -c 60..1015 allRepeats.fasta rnd-6_family-6587#Unknown
$ /home/src/hmmer-3.2.1/easel/miniapps/esl-sfetch -c 61..1015 allRepeats.fasta rnd-6_family-6587#Unknown
Is this a bug? I noticed that 60 is the number of characters per line in the FASTA output, so it looks like it's not handling newlines well.
Thank you in advance,
Ricardo