MKLab-ITI / multimedia-indexing

A framework for large-scale feature extraction, indexing and retrieval.
Apache License 2.0

Sum of within cluster distances is always 0.0 #9

Closed: futurely closed this issue 9 years ago

futurely commented 9 years ago

Computing the sum of within-cluster distances takes too much time, and the value is always displayed as 0.0. Disabling this computation would speed up the learning process.

Thu Apr 23 10:40:05 CST 2015: Iter 1 Sum of within cluster distances: 0.0
Thu Apr 23 10:41:06 CST 2015: Iter 2 Sum of within cluster distances: 0.0
Thu Apr 23 10:41:39 CST 2015: Iter 3 Sum of within cluster distances: 0.0
Thu Apr 23 10:42:11 CST 2015: Iter 4 Sum of within cluster distances: 0.0
Thu Apr 23 10:42:40 CST 2015: Iter 5 Sum of within cluster distances: 0.0
Thu Apr 23 10:43:05 CST 2015: Iter 6 Sum of within cluster distances: 0.0
Thu Apr 23 10:43:28 CST 2015: Iter 7 Sum of within cluster distances: 0.0
Thu Apr 23 10:43:50 CST 2015: Iter 8 Sum of within cluster distances: 0.0
Thu Apr 23 10:44:11 CST 2015: Iter 9 Sum of within cluster distances: 0.0
Thu Apr 23 10:44:31 CST 2015: Iter 10 Sum of within cluster distances: 0.0
Thu Apr 23 10:44:52 CST 2015: Iter 11 Sum of within cluster distances: 0.0
Thu Apr 23 10:45:11 CST 2015: Iter 12 Sum of within cluster distances: 0.0
Thu Apr 23 10:45:32 CST 2015: Iter 13 Sum of within cluster distances: 0.0
Thu Apr 23 10:45:52 CST 2015: Iter 14 Sum of within cluster distances: 0.0
Thu Apr 23 10:46:13 CST 2015: Iter 15 Sum of within cluster distances: 0.0

lefman commented 9 years ago

Thanks for this. The output was 0.0 only when the clustering was run using multiple slots (threads). In any case, the SimpleKMeansWithOutput class was added mainly to support early development. We have now switched to Weka's original SimpleKMeans class, which does not display any output by default. We have also turned on fast distance calculation.
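
For readers who still want this number when clustering runs on several threads, here is a minimal sketch of one way to accumulate it safely. It is not the code of SimpleKMeansWithOutput or of Weka, and the thread does not say why the original value came out as 0.0. The class name `WithinClusterSum`, the method `withinClusterSum`, and the use of squared Euclidean distance are all assumptions made for illustration. Each worker computes a partial sum over its own slice of the points and returns it, and the partial sums are combined only after every worker has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of multimedia-indexing or Weka.
public class WithinClusterSum {

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Sums the squared distance of every point to its assigned centroid,
    // splitting the points into one contiguous slice per slot (thread).
    static double withinClusterSum(double[][] points, int[] assignments,
                                   double[][] centroids, int slots) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        try {
            List<Future<Double>> partials = new ArrayList<>();
            int chunk = (points.length + slots - 1) / slots; // ceiling division
            for (int s = 0; s < slots; s++) {
                final int start = s * chunk;
                final int end = Math.min(points.length, start + chunk);
                // Each task keeps its own local accumulator and returns it,
                // so no shared variable is written from several threads.
                Callable<Double> task = () -> {
                    double local = 0.0;
                    for (int i = start; i < end; i++) {
                        local += squaredDistance(points[i], centroids[assignments[i]]);
                    }
                    return local;
                };
                partials.add(pool.submit(task));
            }
            double total = 0.0;
            for (Future<Double> partial : partials) {
                total += partial.get(); // blocks until that slice is done
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        double[][] points = {{0, 0}, {2, 0}, {10, 10}, {10, 12}};
        double[][] centroids = {{1, 0}, {10, 11}};
        int[] assignments = {0, 0, 1, 1};
        // Every point is at squared distance 1 from its centroid.
        System.out.println(withinClusterSum(points, assignments, centroids, 2)); // prints 4.0
    }
}
```

Returning partial sums through `Future` objects avoids relying on a shared counter, which is a common source of lost updates in multi-threaded accumulation.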