Closed Fla1487 closed 8 months ago
FASTA is a file like this:
>id1
AAACCCGGG
>id2
CCCGGAA
Each sequence must have a unique identifier in order to be uniquely identified.
https://en.wikipedia.org/wiki/FASTA_format:
The description line (defline) or header/identifier line, which begins with ">", gives a name and/or a unique identifier for the sequence,
What is Ridom?
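The uniqueness requirement described above can be checked before running any tool. The following is a minimal sketch, not part of AMRFinderPlus: the function name `duplicate_fasta_ids` is hypothetical, and it assumes the identifier is the first whitespace-delimited token after ">".

```python
from collections import Counter


def duplicate_fasta_ids(lines):
    """Return the identifiers that occur on more than one header line."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            # Assumption: the identifier is the first token after ">".
            fields = line[1:].split()
            ids.append(fields[0] if fields else "")
    counts = Counter(ids)
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


# Small in-memory example with a repeated identifier.
example = """>id1
AAACCCGGG
>id1
CCCGGAA
>id2
GGGTTT
""".splitlines()

print(duplicate_fasta_ids(example))  # prints ['id1']
```

An empty list means every header carries a distinct identifier.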
Hi @Fla1487,
I'm not sure why Ridom is naming sequences like that; it seems highly unconventional. As Slava said above, AMRFinderPlus requires a unique sequence identifier so that it can tell the sequences apart and report which gene or point mutation came from which contig, and where.
If you just want to get results, here's a Perl one-liner that appends a number to each identifier to make sure they're unique:
perl -pe 's/>(\w+)/">$1" . ++$i/e' file.fasta > file.unique_ids.fasta
You could then run AMRFinderPlus on file.unique_ids.fasta.
Hope that helps, Arjun
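For anyone who prefers not to use Perl, the same idea can be sketched in Python. This is an illustrative alternative, not the one-liner itself: the function names `make_ids_unique` and `rewrite_fasta` are hypothetical, and unlike the one-liner it separates the running number from the identifier with an underscore and keeps any description after the identifier.

```python
def make_ids_unique(lines):
    """Yield FASTA lines with a running number appended to each identifier."""
    count = 0
    for line in lines:
        if line.startswith(">"):
            count += 1
            header = line[1:].rstrip("\n")
            # Split the identifier from the optional description.
            parts = header.split(None, 1)
            seq_id = parts[0] if parts else "seq"
            rest = " " + parts[1] if len(parts) > 1 else ""
            yield f">{seq_id}_{count}{rest}\n"
        else:
            # Sequence lines are passed through unchanged.
            yield line


def rewrite_fasta(in_path, out_path):
    """Write a copy of in_path whose identifiers are all distinct."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(make_ids_unique(src))
```

Calling `rewrite_fasta("file.fasta", "file.unique_ids.fasta")` would produce the same kind of output file as the one-liner. Because the number after the last underscore is different for every header, the resulting identifiers cannot collide.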
Thank you for your reply. RidomSeqSphere is a GUI. In the past I used it to analyze .fasta files, but now I have noticed this problem, which is absent when I produce .fasta files with SPAdes/Unicycler from the command line.
I have noticed the differences in the header identifiers, as well as a row between each contig and the formatting of the sequence.
Thank you again.
Thank you Arjun, AMRFinderPlus now works without problems. Actually, I do not know why Ridom generates FASTA files in this format (with the previous version I worked without problems).
Thank you for your help.
Dear All, I have noticed that AMRFinderPlus has a problem when applied to .fasta files in which every contig is named with the same identifier (derived from Ridom). Below is an example:
This problem does not occur with .fasta files like the following (derived from Unicycler):
Do you have any suggestions?
Many thanks