Both approaches work. You can run mmseqs easy-linclust directly on a FASTA file, or you can first convert the FASTA file into a DB with mmseqs createdb and then cluster that DB with mmseqs linclust.
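As a rough sketch, the two routes could look like the commands below. The names sequences.fasta, clusterRes, seqDB, cluDB, clusters.tsv and tmp are placeholders I made up, and the small run wrapper is only there so the commands are printed by default instead of executed; set DRY_RUN=0 to actually run them. Parameters such as the identity threshold are left at their defaults here, so check the wiki before using this on real data.

```shell
#!/bin/sh
# Sketch only: all file and directory names below are placeholders.
INPUT="sequences.fasta"   # hypothetical input FASTA file
TMP="tmp"                 # scratch directory for MMseqs2

# By default the commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Option 1: cluster the FASTA file directly in one step.
run mmseqs easy-linclust "$INPUT" clusterRes "$TMP"

# Option 2: build a sequence DB first, then cluster it.
run mmseqs createdb "$INPUT" seqDB
run mmseqs linclust seqDB cluDB "$TMP"
# Export the clustering as a tab-separated file (representative, member).
run mmseqs createtsv seqDB seqDB cluDB clusters.tsv
```

The easy-linclust route should write its results as files prefixed with clusterRes (for example clusterRes_cluster.tsv), while the DB route keeps everything in MMseqs2's own format until you export it, which can be handy if you plan to run further mmseqs modules on the same DB.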
Hi,
I want to cluster a large dataset of DNA sequences. Must I first convert my FASTA file into a DB-format file, as described here: https://github.com/soedinglab/MMseqs2/wiki#linclust, or can I use my FASTA file directly, as described on the GitHub page?
What is the best approach here?
Kind regards, Marlies