soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
MIT License

Can I define/specify the number of clusters in MMseqs2/Linclust? #593

Closed mrzResearchArena closed 2 years ago

mrzResearchArena commented 2 years ago

Context

I would like to cluster my dataset, which contains approximately 35,000 protein sequences. I need to group the sequences by superfamily, so I would like to set the number of clusters in advance, the way one specifies k in the k-means clustering algorithm.

Can I define the number of clusters (e.g., 10, 50, 250) using the MMseqs2/Linclust tool? Thank you!

mrzResearchArena commented 2 years ago

solved!

meghankane commented 1 year ago

Can you share the solution for this? I'm also looking to try specifying the number of clusters like in k-means. Thanks in advance.

PabloOfEpidemiology commented 1 year ago

How did you solve this?
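
The thread never records how this was solved. MMseqs2 clustering is driven by thresholds such as `--min-seq-id` and `-c` rather than by a target cluster count, so one possible workaround (an assumption, not the original poster's confirmed solution) is to bisect over the identity threshold until the number of clusters lands near the target. The sketch below assumes that the cluster count rises roughly monotonically with `--min-seq-id`, which is not guaranteed, and that `mmseqs easy-cluster` writes its representative/member table to `<prefix>_cluster.tsv`. The helper names `count_clusters`, `run_easy_cluster`, and `search_threshold` are hypothetical, and the fixed coverage of `0.8` is only an illustrative choice.

```python
import csv
import subprocess
from typing import Callable, Optional, Tuple


def count_clusters(tsv_path: str) -> int:
    """Count distinct representatives (first column) in a cluster TSV."""
    with open(tsv_path, newline="") as handle:
        rows = csv.reader(handle, delimiter="\t")
        return len({row[0] for row in rows if row})


def run_easy_cluster(fasta: str, prefix: str, tmp_dir: str, min_seq_id: float) -> int:
    """Run `mmseqs easy-cluster` once and return the resulting cluster count.

    Assumes `mmseqs` is on PATH; the coverage value 0.8 is illustrative only.
    """
    subprocess.run(
        ["mmseqs", "easy-cluster", fasta, prefix, tmp_dir,
         "--min-seq-id", f"{min_seq_id:.3f}", "-c", "0.8"],
        check=True,
    )
    return count_clusters(f"{prefix}_cluster.tsv")


def search_threshold(
    cluster_count: Callable[[float], int],
    target: int,
    low: float = 0.0,
    high: float = 1.0,
    steps: int = 8,
) -> Optional[Tuple[float, int]]:
    """Bisect the identity threshold; return the (threshold, count) pair
    whose count came closest to `target` within `steps` clustering runs."""
    best: Optional[Tuple[float, int]] = None
    for _ in range(steps):
        mid = (low + high) / 2
        n = cluster_count(mid)
        if best is None or abs(n - target) < abs(best[1] - target):
            best = (mid, n)
        if n == target:
            break
        if n < target:
            low = mid   # too few clusters: raise the identity threshold
        else:
            high = mid  # too many clusters: lower the identity threshold
    return best
```

A call might look like `search_threshold(lambda t: run_easy_cluster("proteins.fasta", "clu", "tmp", t), target=250)`, where the file and directory names are placeholders. Each step is a full clustering run, and an exact match to the target is unlikely; the function simply returns the closest count it saw.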