fluhus closed this issue 1 year ago
It should just work; I don't think you need to change any parameters. Just call `easy-linclust` or `easy-cluster` on your nucleotide input.
MMseqs2 does have issues with long sequences and internally splits them, but for viruses it should work pretty well.
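For anyone scripting this, here is a minimal sketch of the call described above, wrapped in Python so it can be dropped into a pipeline. The file names (`viral_genomes.fasta`, `clusterRes`, `tmp`) and the helper names are placeholders made up for illustration, not anything from the MMseqs2 docs. It assumes `mmseqs` is on your `PATH` and deliberately passes no extra options, leaving the defaults as they are.

```python
import shutil
import subprocess

# Workflows mentioned in this thread; both take the same positional arguments.
WORKFLOWS = ("easy-linclust", "easy-cluster")


def build_cluster_command(fasta, prefix, tmp_dir, workflow="easy-linclust"):
    """Return the argument list for: mmseqs <workflow> <fasta> <prefix> <tmp_dir>."""
    if workflow not in WORKFLOWS:
        raise ValueError(f"unsupported workflow: {workflow}")
    return ["mmseqs", workflow, fasta, prefix, tmp_dir]


def run_clustering(fasta, prefix, tmp_dir, workflow="easy-linclust"):
    """Run the clustering if mmseqs is installed; otherwise just report the command."""
    cmd = build_cluster_command(fasta, prefix, tmp_dir, workflow)
    if shutil.which("mmseqs") is None:
        print("mmseqs not found on PATH; would have run:", " ".join(cmd))
        return None
    # check=True raises CalledProcessError if mmseqs exits with an error.
    return subprocess.run(cmd, check=True)


# Hypothetical usage with placeholder file names:
# run_clustering("viral_genomes.fasta", "clusterRes", "tmp")
```

If the run succeeds, the cluster assignments should end up in a TSV file named after the output prefix (here `clusterRes_cluster.tsv`), alongside a FASTA file of representative sequences.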
Thank you!
I ran it and noticed that it automatically detected that the dataset contained nucleotide sequences, so all good.
Hi, thanks for making this toolkit! I'm excited to start using it with my data.
I have a set of viral genomes that I would like to cluster. From the wiki and the paper, I understand that linclust by default runs a process that's optimized for protein sequences (using BLOSUM62, k-mer length, etc.). Can it run on nucleotide sequences? If so, how should I go about it?