K-mode - Githubissues

ma1f commented 3 years ago

I would like to automatically cluster high-dimensional binary (true/false) data, and I believe k-mode would be more appropriate than k-means for this scenario.

Are there plans to support further clustering algorithms, including k-mode?

michaelgsharp commented 3 years ago

@briacht for visibility.

justinormont commented 3 years ago

In practice, I think you'll get a similar result by running k-means.
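One way to see why the results tend to be similar: on 0/1 vectors, squared Euclidean distance (what k-means minimizes) is exactly the Hamming distance (the simple-matching dissimilarity k-modes uses), so the two algorithms differ mainly in how the cluster center is computed (mean vs. per-column mode). A small sketch in plain Python; the helper names are made up for illustration and are not ML.NET APIs:

```python
def hamming(a, b):
    """Number of positions where two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length numeric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def binary_mode(rows):
    """Per-column majority value for 0/1 rows (ties resolved to 1, an arbitrary choice)."""
    n = len(rows)
    return [1 if 2 * sum(col) >= n else 0 for col in zip(*rows)]


a = [1, 0, 1, 1, 0]
b = [0, 0, 1, 0, 1]

# For 0/1 data each differing position contributes (1 - 0)^2 = 1,
# so both measures count the same thing.
d_hamming = hamming(a, b)            # 3
d_euclid = squared_euclidean(a, b)   # 3

# The centers differ: k-means keeps fractional means, k-modes keeps 0/1 modes.
cluster = [[1, 0, 1], [1, 1, 0], [1, 0, 0]]
means = [sum(col) / len(cluster) for col in zip(*cluster)]
modes = binary_mode(cluster)         # [1, 0, 0]
```

Thresholding the k-means centroid at 0.5 gives the same vector as the mode here, which is why the cluster assignments usually end up close.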

Expanding beyond your boolean data to categorical data, k-modes can offer some speed and memory savings over k-means with one-hot encoding. I haven't tried it, though there are likely advantages in having the additional distance (dissimilarity) functions.
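For reference, a minimal k-modes sketch on categorical rows, assuming the simple-matching dissimilarity and caller-supplied initial modes. This is illustrative Python, not ML.NET code, and every name in it is hypothetical. Note that it works on the raw category values directly, with no one-hot expansion, which is where the memory savings would come from:

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Simple-matching dissimilarity: count of attributes whose categories differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each column of a non-empty list of rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(rows, initial_modes, max_iter=20):
    """Alternate assignment and mode-update steps until labels stop changing."""
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest mode.
        new_labels = [
            min(range(len(modes)), key=lambda j: matching_dissimilarity(r, modes[j]))
            for r in rows
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: recompute each mode from its members (empty clusters keep their mode).
        for j in range(len(modes)):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes


rows = [
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
    ("green", "large", "square"),
]

# Seeding with two of the rows is an arbitrary choice for this example.
labels, modes = k_modes(rows, initial_modes=[rows[0], rows[3]])
# labels -> [0, 0, 0, 1, 1, 1]
# modes  -> [("red", "small", "round"), ("blue", "large", "square")]
```

A real implementation would also need a proper initialization strategy and tie-breaking rules; those are left out here.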

For future clustering algos in ML.NET, there is existing ML.NET/TLC code (1, 2) for OPTICS and DBSCAN, which could be brought into this repo.

dotnet / machinelearning

K-mode #5957