lbehnke / hierarchical-clustering-java

Implementation of an agglomerative hierarchical clustering algorithm in Java. Different linkage approaches are supported.

pushing up the speed #4

Closed (alexmasselot closed 10 years ago)

alexmasselot commented 10 years ago

Hi, nice piece of code, thanks :) However, I hit a speed limit: it was really an order of magnitude slower than R. I'm not catching up with R, but I improved the run time by a factor of about 100 for clusters of size 500, mainly by adding a map to eliminate the loop in findByCluster. I've also added ClusterPerfTest, which is not really a test but just prints timings.
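To illustrate the kind of change described above, here is a minimal sketch of replacing a linear scan with a map lookup. It is not the code from this pull request: the class `LinkIndexSketch`, the nested `Link` type, and both lookup methods are hypothetical names invented for the example, and the library's real `findByCluster` may look quite different.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the library itself.
public class LinkIndexSketch {

    /** Stand-in for a distance entry between two named clusters. */
    public static final class Link {
        public final String left;
        public final String right;
        public final double distance;

        public Link(String left, String right, double distance) {
            this.left = left;
            this.right = right;
            this.distance = distance;
        }
    }

    private final List<Link> links = new ArrayList<>();
    // Index from cluster name to every link that touches that cluster.
    private final Map<String, List<Link>> byCluster = new HashMap<>();

    public void add(Link link) {
        links.add(link);
        byCluster.computeIfAbsent(link.left, k -> new ArrayList<>()).add(link);
        if (!link.right.equals(link.left)) {
            byCluster.computeIfAbsent(link.right, k -> new ArrayList<>()).add(link);
        }
    }

    /** Slow variant: walks every link on each call, O(number of links). */
    public List<Link> findByClusterScan(String name) {
        List<Link> result = new ArrayList<>();
        for (Link link : links) {
            if (link.left.equals(name) || link.right.equals(name)) {
                result.add(link);
            }
        }
        return result;
    }

    /** Fast variant: a single hash lookup, O(1) on average. */
    public List<Link> findByClusterIndexed(String name) {
        return byCluster.getOrDefault(name, Collections.emptyList());
    }

    public static void main(String[] args) {
        LinkIndexSketch index = new LinkIndexSketch();
        index.add(new Link("a", "b", 1.0));
        index.add(new Link("b", "c", 2.0));
        index.add(new Link("a", "c", 3.0));
        System.out.println(index.findByClusterScan("b").size());
        System.out.println(index.findByClusterIndexed("b").size());
    }
}
```

The trade-off is that the index has to be kept in sync whenever links are added or removed; the sketch only covers insertion.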

If the run time for larger clusters ever becomes unbearable, I guess the solution will be to port all of this to Scala and parallelize it.

Thanks again for the contribution,
Alex

ORIG

Running com.apporiented.algorithm.clustering.ClusterPerfTest

cluster.size  time.ms
2             3
4             0
8             5
16            25
32            56
64            240
128           3253
256           51018
512           969193

Heap size 760M

NEW

cluster.size  time.ms
2             3
4             1
8             2
16            11
32            36
64            56
128           131
256           726
512           10167

Heap size 270M
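For reference, the speedup per cluster size can be worked out from the ORIG and NEW timings above. The following is a small sketch that only divides the reported numbers; the class name `SpeedupSketch` and its helper are made up for this example, and the timings are taken at face value from a single run.

```java
// Hypothetical helper: computes ORIG/NEW ratios from the timings reported above.
public class SpeedupSketch {

    static final int[] SIZES = {2, 4, 8, 16, 32, 64, 128, 256, 512};
    static final long[] ORIG_MS = {3, 0, 5, 25, 56, 240, 3253, 51018, 969193};
    static final long[] NEW_MS = {3, 1, 2, 11, 36, 56, 131, 726, 10167};

    /** Ratio of the original run time to the new run time. */
    public static double speedup(long origMs, long newMs) {
        return (double) origMs / newMs;
    }

    public static void main(String[] args) {
        System.out.println("cluster.size  speedup");
        for (int i = 0; i < SIZES.length; i++) {
            System.out.printf("%-13d %.1f%n", SIZES[i], speedup(ORIG_MS[i], NEW_MS[i]));
        }
    }
}
```

For the largest size, 969193 ms against 10167 ms is a ratio of roughly 95, which matches the "factor 100" mentioned in the description. The values for the smallest sizes are only a few milliseconds and are probably dominated by noise.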

lbehnke commented 10 years ago

Thanks for your optimization.