I would like to cluster my dataset of approximately 35,000 protein sequences so that the clusters correspond to their superfamilies. To do this, I would like to set the number of clusters in advance, in the same way that the number of clusters can be specified for the K-Means clustering algorithm.
Can I define the number of clusters (e.g., 10, 50, 250) using the MMseqs2/Linclust tool? Thank you!
Your Environment
Server specifications (especially CPU support for AVX2/SSE and amount of system memory): AVX2