Hey @lbhenke, thanks for this implementation. It worked nicely on small datasets, but it ran into heap space issues with larger ones. I'm not sure this commit solves the entire problem, but it reduces the size of `distances` from `n*n` to `n*(n-1)/2`, since `distances` is a symmetric matrix whose main diagonal is irrelevant.
It follows the strategy used by MATLAB's `pdist` function, and I also kept it as a single-row matrix so I wouldn't need to change or add method headers in the `ClusteringAlgorithm` interface.
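
For reference, here is a minimal sketch of the indexing idea, not the exact code in this commit. The class and method names (`CondensedDistances`, `size`, `index`) are made up for illustration, and I'm assuming the same ordering `pdist` documents: pairs `(0,1), (0,2), ..., (0,n-1), (1,2), ...` with 0-based indices.

```java
/** Hypothetical helper: maps a pair (i, j) to its slot in a pdist-style condensed row. */
public final class CondensedDistances {
    private CondensedDistances() {}

    /** Number of stored entries for n observations: n*(n-1)/2. */
    public static int size(int n) {
        // Compute in long so the product cannot overflow before the division.
        return Math.toIntExact((long) n * (n - 1) / 2);
    }

    /** Position of the distance between observations i and j (0-based, i != j). */
    public static int index(int n, int i, int j) {
        if (i < 0 || j < 0 || i >= n || j >= n) {
            throw new IndexOutOfBoundsException("i and j must be in [0, n)");
        }
        if (i == j) {
            throw new IllegalArgumentException("the main diagonal is not stored");
        }
        if (i > j) { // symmetry: d(i, j) == d(j, i), so only i < j is stored
            int tmp = i;
            i = j;
            j = tmp;
        }
        // Entries before row i: i*n - i*(i+1)/2; then the offset inside row i.
        long k = (long) i * n - (long) i * (i + 1) / 2 + (j - i - 1);
        return Math.toIntExact(k);
    }
}
```

With `n = 4` the row holds 6 values instead of 16, and `index(4, 1, 3)` and `index(4, 3, 1)` both point at the same slot. Since Java arrays are indexed by `int`, this layout still caps `n` at roughly 65,000 observations.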