Closed zhaoc1 closed 12 months ago
Hi @zhaoc1 ,
I am not able to reproduce the error. You may share your fasta file with me by email or link, if it isn't very large.
Best, Carlos
Hi Carlos,
Thanks for the reply. The original input, centroids.ffn, was multiple FASTA files concatenated into one file. I tried rerunning eggnog with the individual FASTA files instead, and I no longer encountered the same error 🤔
Anyway, I am closing this issue now.
Chunyu
Hi @zhaoc1 ,
Thank you very much for your feedback.
Best, Carlos
Actually, I located the problem FASTA input (attached). I also attached the GNU Time log file.
My guess is that this sequence is not detected as CDS, because it doesn't start with ATG:
GUT_GENOME287165_01908 TGA...
So the CDS will be empty. You could try using a different translation table, or translating them yourself to proteins and using --itype protein. I am not sure if UGA is a start codon anywhere...
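To find such records before running the annotation, a small pre-check along these lines may help. It is only a sketch using the Python standard library, not eggnog-mapper's own logic: the function names `parse_fasta` and `suspicious_records` are hypothetical, and the set of accepted start codons (ATG, GTG, TTG, the common bacterial ones) is an assumption you may want to adjust.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumption: the common bacterial start codons; adjust for your translation table.
START_CODONS = frozenset({"ATG", "GTG", "TTG"})


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""  # ID is the first word of the header
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def suspicious_records(lines: Iterable[str],
                       start_codons=START_CODONS) -> Dict[str, str]:
    """Map record IDs to a reason when the sequence is empty or
    does not begin with one of the accepted start codons."""
    flagged: Dict[str, str] = {}
    for name, seq in parse_fasta(lines):
        seq = seq.upper().replace("U", "T")  # treat RNA letters as DNA
        if not seq:
            flagged[name] = "empty sequence"
        elif seq[:3] not in start_codons:
            flagged[name] = f"starts with {seq[:3]}"
    return flagged
```

For example, `with open("centroids.ffn") as fh: print(suspicious_records(fh))` would list the records worth inspecting or removing before the run.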
Best, Carlos
It makes sense. Looking back to the prokka annotation of "GUT_GENOME287165_01908", it is "23S ribosomal RNA (partial)". Thanks Carlos.
Ah, if it is an rRNA, that makes sense, yes. We should add code that checks whether the CDS are non-empty. Sorry for the inconvenience. Best, Carlos
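Such a check could look roughly like the sketch below. This is not the actual eggnog-mapper code: `require_non_empty` is a hypothetical helper, and it assumes the extracted CDS are available as a dictionary mapping sequence IDs to sequence strings.

```python
from typing import Mapping


def require_non_empty(sequences: Mapping[str, str]) -> None:
    """Raise a readable error naming every record whose sequence is empty.

    `sequences` is assumed to map sequence IDs to sequence strings.
    """
    empty = [name for name, seq in sequences.items() if not seq.strip()]
    if empty:
        raise ValueError(
            "No sequence could be extracted for: " + ", ".join(empty)
            + ". Check that these records are valid CDS."
        )
```

Failing early with the offending IDs makes it clearer that the problem lies in the input records rather than in a later step.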
Hi,
Here is my emapper command:
I got an error message like the following:
I double-checked that the input centroids.ffn is valid FASTA and that GUT_GENOME287165_01908 does have a sequence in the input FASTA. So the error message seems to indicate that the sequence for GUT_GENOME287165_01908 goes missing at some intermediate step. Any ideas what's going on here? Thank you!
Chunyu