lessthanoptimal / ddogleg

Java numerics library for optimization, polynomial root finding, sorting, robust model fitting, and more.
http://ddogleg.org

KMeans InitializePlusPlus.selectNextSeed throws RuntimeException: This shouldn't happen #3

Closed pcmoen closed 7 years ago

pcmoen commented 7 years ago

Thanks for your great libraries BoofCV and DDogleg!

Seeding KMeans clusters with the KMeans++ initializer will, for certain data sets, throw a RuntimeException with the message "This shouldn't happen".

This happens on data sets of various sizes. I guess it is more prevalent on small data sets. My data points are from the SIFT descriptor in BoofCV.

The code example KMeansFailsSeeding.txt will trigger this exception.

The following is the stack trace of the exception.

Exception in thread "main" java.lang.RuntimeException: This shouldn't happen
    at org.ddogleg.clustering.kmeans.InitializePlusPlus.selectNextSeed(InitializePlusPlus.java:97)
    at org.ddogleg.clustering.kmeans.InitializePlusPlus.selectSeeds(InitializePlusPlus.java:75)
    at org.ddogleg.clustering.kmeans.StandardKMeans_F64.process(StandardKMeans_F64.java:140)
    at KMeansFailsSeeding.main(KMeansFailsSeeding.java:30)

lessthanoptimal commented 7 years ago

Fixed. What was happening is that each point was duplicated. You requested 10 seeds but there were not that many unique points. That edge case had not been handled, nor had its expected behavior even been defined. That's all been fixed in the SNAPSHOT of ddogleg. Thanks for reporting.
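
For readers who hit the same edge case, the following is a minimal, self-contained sketch of k-means++ style seeding that tolerates duplicated points. It is not ddogleg's code and may not match how the fix was implemented there: the class name PlusPlusSeedSketch, the method selectSeeds, and the fallback policy (reuse any not-yet-chosen point once every remaining point coincides with an existing seed) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; NOT the ddogleg InitializePlusPlus implementation.
public class PlusPlusSeedSketch {

    /** Squared Euclidean distance between two points of equal length. */
    static double distanceSq(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the indices of numSeeds distinct points, picked in the k-means++ manner. */
    static int[] selectSeeds(double[][] points, int numSeeds, Random rand) {
        if (numSeeds < 1 || numSeeds > points.length)
            throw new IllegalArgumentException("numSeeds must be in [1, points.length]");

        int[] seeds = new int[numSeeds];
        boolean[] used = new boolean[points.length];
        // dist[i] = squared distance from point i to its nearest seed so far
        double[] dist = new double[points.length];
        Arrays.fill(dist, Double.MAX_VALUE);

        // The first seed is drawn uniformly at random
        seeds[0] = rand.nextInt(points.length);
        used[seeds[0]] = true;

        for (int s = 1; s < numSeeds; s++) {
            // Update distances using the most recently added seed
            double[] latest = points[seeds[s - 1]];
            double total = 0;
            for (int i = 0; i < points.length; i++) {
                dist[i] = Math.min(dist[i], distanceSq(points[i], latest));
                total += dist[i];
            }

            int chosen = -1;
            if (total > 0) {
                // Normal case: sample with probability proportional to dist[i]
                double target = rand.nextDouble() * total;
                double running = 0;
                for (int i = 0; i < points.length; i++) {
                    running += dist[i];
                    if (dist[i] > 0 && running > target) {
                        chosen = i;
                        break;
                    }
                }
            }
            if (chosen < 0) {
                // Edge case: every remaining point duplicates an existing seed,
                // so all weights are zero. Assumed policy: take any unused point.
                for (int i = 0; i < points.length; i++) {
                    if (!used[i]) {
                        chosen = i;
                        break;
                    }
                }
            }
            used[chosen] = true;
            seeds[s] = chosen;
        }
        return seeds;
    }

    public static void main(String[] args) {
        // Each point appears twice, so only 2 unique locations exist for 3 seeds
        double[][] points = {{1, 1}, {1, 1}, {2, 2}, {2, 2}};
        int[] seeds = selectSeeds(points, 3, new Random(42));
        System.out.println(Arrays.toString(seeds));
    }
}
```

With the weighted draw alone, a request for more seeds than there are unique points leaves a total weight of zero and nothing to select, which is the kind of state the exception above reports. Other policies are equally defensible, such as returning fewer seeds or rejecting the request, so callers may prefer to count unique points before choosing the number of clusters.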