lbehnke / hierarchical-clustering-java

Implementation of an agglomerative hierarchical clustering algorithm in Java. Different linkage approaches are supported.

How can I generate a linkage matrix like SciPy or MATLAB does? #22

Open KIC opened 7 years ago

KIC commented 7 years ago

For example, MATLAB gives me:

pkg load statistics
distances = [ 0, 1, 9, 7, 11, 14 ; 1, 0, 4, 3, 8, 10 ; 9, 4, 0, 9, 2, 8 ; 7, 3, 9, 0, 6, 13 ; 11, 8, 2, 6, 0, 10 ; 14, 10, 8, 13, 10, 0 ]
z=linkage(distances, 'single', 'euclidean')
z =

    3.0000    5.0000    6.4031
    1.0000    2.0000    8.2462
    4.0000    8.0000    9.5917
    7.0000    9.0000   13.0767
    6.0000   10.0000   16.4012

Or SciPy gives me:

import pandas as pd
import scipy.cluster.hierarchy as sch

names = ["O1", "O2", "O3", "O4", "O5", "O6"]
distances = [ [ 0, 1, 9, 7, 11, 14 ], [ 1, 0, 4, 3, 8, 10 ], [ 9, 4, 0, 9, 2, 8 ], [ 7, 3, 9, 0, 6, 13 ], [ 11, 8, 2, 6, 0, 10 ], [ 14, 10, 8, 13, 10, 0 ]]
x = pd.DataFrame(distances, columns=names)
link = sch.linkage(x, 'single')
print(link)
[[  2.           4.           6.40312424   2.        ]
 [  0.           1.           8.24621125   2.        ]
 [  3.           7.           9.59166305   3.        ]
 [  6.           8.          13.07669683   5.        ]
 [  5.           9.          16.40121947   6.        ]]

With the Java version, I am not sure how to construct such a linkage matrix, because it gives me:

package scratch;

import com.apporiented.algorithm.clustering.Cluster;
import com.apporiented.algorithm.clustering.ClusteringAlgorithm;
import com.apporiented.algorithm.clustering.DefaultClusteringAlgorithm;
import com.apporiented.algorithm.clustering.SingleLinkageStrategy;

public class Clustering {

    public static void main(String[] args) {
        String[] names = new String[] { "O1", "O2", "O3", "O4", "O5", "O6" };
        double[][] distances = new double[][] {
                { 0, 1, 9, 7, 11, 14 },
                { 1, 0, 4, 3, 8, 10 },
                { 9, 4, 0, 9, 2, 8 },
                { 7, 3, 9, 0, 6, 13 },
                { 11, 8, 2, 6, 0, 10 },
                { 14, 10, 8, 13, 10, 0 }};

        ClusteringAlgorithm alg = new DefaultClusteringAlgorithm();
        Cluster cluster = alg.performClustering(distances, names, new SingleLinkageStrategy());
        cluster.toConsole(4);
    }

}

        clstr#5  distance: distance : 8.00, weight : 6.00
          O6 (leaf)  distance: distance : 0.00, weight : 1.00
          clstr#4  distance: distance : 4.00, weight : 5.00
            clstr#2  distance: distance : 2.00, weight : 2.00
              O3 (leaf)  distance: distance : 0.00, weight : 1.00
              O5 (leaf)  distance: distance : 0.00, weight : 1.00
            clstr#3  distance: distance : 3.00, weight : 3.00
              O4 (leaf)  distance: distance : 0.00, weight : 1.00
              clstr#1  distance: distance : 1.00, weight : 2.00
                O1 (leaf)  distance: distance : 0.00, weight : 1.00
                O2 (leaf)  distance: distance : 0.00, weight : 1.00
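
One possible way to get a SciPy-style matrix out of such a tree is to collect the merges, order them by distance, and number each new cluster n, n+1, ... after the n leaves. The sketch below is self-contained and uses a hypothetical Node class as a stand-in for the library's Cluster; mapping its fields to the real Cluster accessors (name, children, distance value) is an assumption to check against the library. It also assumes binary merges and distances that never decrease towards the root. Note that the heights will be 1, 2, 3, 4, 8 rather than 6.40, 8.25, ...: the Java call treats the input as a distance matrix, whereas the linkage calls above treat each row as an observation and compute Euclidean distances between rows first (for O3 and O5 that is sqrt(41) = 6.4031).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class LinkageMatrix {

    /** Hypothetical stand-in for the library's Cluster: name, merge distance, children. */
    static final class Node {
        final String name;
        final double distance;
        final List<Node> children;

        Node(String name, double distance, Node... children) {
            this.name = name;
            this.distance = distance;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() { return children.isEmpty(); }
    }

    /** Collects internal nodes in post-order, so children always precede their parent. */
    private static void collect(Node node, List<Node> merges) {
        if (node.isLeaf()) return;
        for (Node child : node.children) collect(child, merges);
        merges.add(node);
    }

    /** Returns rows of {idA, idB, distance, leafCount} with 0-based ids, as SciPy does. */
    static double[][] toLinkage(Node root, String[] leafNames) {
        int n = leafNames.length;
        Map<String, Integer> leafIndex = new HashMap<>();
        for (int i = 0; i < n; i++) leafIndex.put(leafNames[i], i);

        List<Node> merges = new ArrayList<>();
        collect(root, merges);
        // List.sort is stable, so on equal distances a child still precedes its parent.
        merges.sort(Comparator.comparingDouble((Node m) -> m.distance));

        Map<Node, Integer> ids = new IdentityHashMap<>();
        Map<Node, Integer> sizes = new IdentityHashMap<>();
        double[][] z = new double[merges.size()][];
        for (int row = 0; row < merges.size(); row++) {
            Node merge = merges.get(row);
            int[] pair = new int[2];   // assumes exactly two children per merge
            int size = 0;
            for (int c = 0; c < 2; c++) {
                Node child = merge.children.get(c);
                pair[c] = child.isLeaf() ? leafIndex.get(child.name) : ids.get(child);
                size += child.isLeaf() ? 1 : sizes.get(child);
            }
            ids.put(merge, n + row);   // new cluster ids continue after the leaves
            sizes.put(merge, size);
            z[row] = new double[] {
                Math.min(pair[0], pair[1]), Math.max(pair[0], pair[1]), merge.distance, size };
        }
        return z;
    }

    /** Rebuilds the tree shown in the console output above. */
    static Node sampleTree() {
        Node c1 = new Node("clstr#1", 1.0, new Node("O1", 0.0), new Node("O2", 0.0));
        Node c2 = new Node("clstr#2", 2.0, new Node("O3", 0.0), new Node("O5", 0.0));
        Node c3 = new Node("clstr#3", 3.0, new Node("O4", 0.0), c1);
        Node c4 = new Node("clstr#4", 4.0, c2, c3);
        return new Node("clstr#5", 8.0, new Node("O6", 0.0), c4);
    }

    public static void main(String[] args) {
        String[] names = { "O1", "O2", "O3", "O4", "O5", "O6" };
        for (double[] row : toLinkage(sampleTree(), names)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For the tree above this gives the rows [0, 1, 1, 2], [2, 4, 2, 2], [3, 6, 3, 3], [7, 8, 4, 5] and [5, 9, 8, 6]. For MATLAB's 1-based numbering, add 1 to the first two columns.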