accord-net / framework

Machine learning, computer vision, statistics and general scientific computing for .NET
http://accord-framework.net
GNU Lesser General Public License v2.1

KMeans Dimension Weights #1681

Open GuyBenhaim opened 5 years ago

GuyBenhaim commented 5 years ago

Hello. I am using KMeans and have added a WeightedSquareEuclidean distance to scale the values of the third dimension, so that its effect is balanced against the first two dimensions. This does not seem to affect the clustering results. Is this feature working?
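
For reference, a weighted squared Euclidean distance is conventionally written as below, with $w_i$ the per-dimension weights. This assumes WeightedSquareEuclidean follows the usual definition, which the thread itself does not confirm.

$$
d_w(x, y) = \sum_{i=1}^{n} w_i \, (x_i - y_i)^2
$$

Under that definition, weights of (1, 1, 120) multiply the third dimension's squared difference by 120, which is equivalent to rescaling that coordinate by $\sqrt{120} \approx 10.95$ before taking an unweighted squared distance.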

    int K = 10;
    // Create a new K-Means algorithm instance
    KMeans kmeans = new KMeans(K);
    {
        // Distance weights the "importance" (?) of dimensions.
        var Distance = new WeightedSquareEuclidean(new double[] { 1.0, 1.0, 120.0 });
    }

SuperDaveWhite commented 5 years ago

It seems to work for me. I haven't tried it, but maybe it's because you are giving weights greater than 1.
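
On whether weights above 1 matter by themselves: assuming the distance is a weighted sum of squared differences with weights $w_i$ (an assumption, since the thread does not show the implementation), multiplying every weight by the same constant $c > 0$ scales every distance by that factor:

$$
\sum_{i} (c \, w_i)(x_i - y_i)^2 = c \sum_{i} w_i \, (x_i - y_i)^2
$$

The nearest centroid for each point would then be unchanged, so only the ratios between the weights should influence the cluster assignments.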

I suggest setting up a small playground to test different inputs. I did this to find the number of groups, and I expect you can set up something similar for the weighting. You are probably looking for the hockey stick (the elbow in the curve) anyway to set the number of groups, if you don't know it from the problem.

    // Create a new K-Means algorithm
    var kmeans = new KMeans((int)numUpDownLearn.Value)
    {
        Tolerance = 0.0000005,
        Distance = new WeightedEuclidean(new double[] { 0.9, 0.05, 0.05, 0.9 }),
    };
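
For comparison, here is a minimal sketch that applies the weights from the first snippet using the same object-initializer form as the snippet above. One possible reason the weights had no visible effect in the original post is that its braces follow a completed statement (`new KMeans(K);`), so `var Distance = ...` declares a local variable inside a standalone block rather than assigning the `Distance` property. This is a reading of the posted code, not a confirmed diagnosis. The sample observations, the class name, and k = 2 are made up for illustration, and the sketch assumes the Accord.MachineLearning NuGet package is referenced.

```csharp
using System;
using Accord.MachineLearning;
using Accord.Math.Distances;

class WeightedKMeansSketch
{
    static void Main()
    {
        // Hypothetical 3-D observations: the third dimension has a much
        // smaller numeric range than the first two.
        double[][] observations =
        {
            new double[] { 1.0, 2.0, 0.01 },
            new double[] { 1.2, 1.8, 0.90 },
            new double[] { 0.9, 2.1, 0.02 },
            new double[] { 8.0, 9.0, 0.95 },
            new double[] { 8.2, 9.1, 0.03 },
            new double[] { 7.9, 8.8, 0.92 },
        };

        // No semicolon before the braces: this is an object initializer,
        // so Distance here is the KMeans property, not a new local variable.
        var kmeans = new KMeans(2)
        {
            Distance = new WeightedSquareEuclidean(new double[] { 1.0, 1.0, 120.0 })
        };

        // Fit the model, then assign each observation to a cluster.
        KMeansClusterCollection clusters = kmeans.Learn(observations);
        int[] labels = clusters.Decide(observations);

        Console.WriteLine(string.Join(", ", labels));
    }
}
```

Running the same data once with weights of { 1.0, 1.0, 1.0 } and once with { 1.0, 1.0, 120.0 } is a quick way to check whether the weighting changes the labels at all.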