apcamargo / genomad

geNomad: Identification of mobile genetic elements
https://portal.nersc.gov/genomad/

Fasta contains multiple entries with the same identifier #44

Closed: jsming1996 closed this issue 8 months ago

jsming1996 commented 8 months ago

Hello, I got the following error: "con1.fa is either empty or contains multiple entries with the same identifier. Please check your input FASTA file and execute genomad annotate again." I saw on the website that geNomad can be used for metagenomic data, but when I use this file (con1.fa) as the input, it doesn't work. As I understand it, the identifiers of the reads in my data can be repeated. Should I change all the identifiers? Could you give me some advice on how to solve this problem? I would really appreciate it. Best wishes!

apcamargo commented 8 months ago

Hi @jsming1996

You can use SeqKit to solve that:

seqkit rename con1.fa > con1_no_duplicates.fa
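
If it helps to see which identifiers are repeated before renaming them, the check can be sketched in a few lines of Python. This is only an illustration and not part of geNomad or SeqKit: the function names (`record_id`, `find_duplicate_ids`, `rename_duplicates`) are hypothetical, the identifier is assumed to be the text between `>` and the first whitespace, and the `_2`, `_3`, ... suffix scheme is an assumption chosen for the example rather than a description of what `seqkit rename` writes.

```python
import re
from collections import Counter


def record_id(header):
    """Return the identifier of a FASTA header line.

    Assumption: the identifier is everything after '>' up to the
    first whitespace character.
    """
    return re.match(r">(\S*)", header).group(1)


def find_duplicate_ids(lines):
    """Map each identifier that occurs more than once to its count."""
    counts = Counter(record_id(l) for l in lines if l.startswith(">"))
    return {seq_id: n for seq_id, n in counts.items() if n > 1}


def rename_duplicates(lines):
    """Return the lines with repeated identifiers made unique.

    The first occurrence keeps its identifier; later occurrences get a
    numeric suffix (hypothetical scheme: id_2, id_3, ...). Suffixes that
    would collide with an identifier already in the file are skipped.
    """
    lines = list(lines)
    reserved = {record_id(l) for l in lines if l.startswith(">")}
    seen = set()
    next_suffix = {}
    out = []
    for line in lines:
        if not line.startswith(">"):
            out.append(line)  # sequence line, copied unchanged
            continue
        seq_id = record_id(line)
        if seq_id not in seen:
            seen.add(seq_id)
            out.append(line)
            continue
        n = next_suffix.get(seq_id, 2)
        new_id = f"{seq_id}_{n}"
        while new_id in reserved:
            n += 1
            new_id = f"{seq_id}_{n}"
        next_suffix[seq_id] = n + 1
        reserved.add(new_id)
        seen.add(new_id)
        # Keep the description and line ending that followed the old ID.
        rest = line[1 + len(seq_id):]
        out.append(f">{new_id}{rest}")
    return out
```

To use it on a file, read the lines with `open("con1.fa").readlines()`, pass them to `find_duplicate_ids` to list the repeated identifiers, and write the output of `rename_duplicates` to a new file. If `find_duplicate_ids` returns an empty dictionary, the error is more likely caused by an empty input file.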