philliplab / yasss

Yet Another Short Sequence Simulator

Add reporting on within / between cluster distances to sim_proc_many_pops #153

Closed philliplab closed 6 years ago

philliplab commented 6 years ago

Divide the distance matrix into two clusters. Report the within-cluster distance and the between-cluster distance.

Also report the size of the smallest cluster

philliplab commented 6 years ago

Step 1: Add a clara2 element to the list returned by summarize_dmat

Use a variation of this code:

library(cluster)  # provides clara()

x <- many_pops$all_dmats[[1]]

z <- as.matrix(x$dmat)
y <- clara(z, 2)
cluster1 <- which(y$clustering == 1)
cluster2 <- which(y$clustering == 2)

within_cluster <- c(as.vector(z[cluster1, cluster1]), as.vector(z[cluster2, cluster2]))
between_cluster <- c(as.vector(z[cluster1, cluster2]), as.vector(z[cluster2, cluster1]))

mean(within_cluster)
mean(between_cluster)

cluster_sizes <- c(length(cluster1), length(cluster2))
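One possible variation, as a rough sketch only: the helper names `cluster_dist_summary` and `clara2_summary`, and the element names in the returned list, are hypothetical and not taken from yasss. It differs from the snippet above in one respect that is worth deciding on deliberately. The within-cluster vector above includes the zero diagonal and counts each pair twice, which pulls the within-cluster mean down; this sketch uses only the upper triangle of each within-cluster block. Whether `summarize_dmat` should behave this way is an assumption.

```r
# Hypothetical helper: summarise a distance matrix given a two-cluster
# assignment (a vector of 1s and 2s, one entry per row of the matrix).
cluster_dist_summary <- function(z, clustering) {
  z <- as.matrix(z)
  cluster1 <- which(clustering == 1)
  cluster2 <- which(clustering == 2)

  # Within-cluster distances: each unordered pair once, diagonal excluded.
  within_pairs <- function(idx) {
    sub <- z[idx, idx, drop = FALSE]
    sub[upper.tri(sub)]
  }
  within_cluster <- c(within_pairs(cluster1), within_pairs(cluster2))

  # Between-cluster distances: every (cluster1, cluster2) pair once.
  between_cluster <- as.vector(z[cluster1, cluster2, drop = FALSE])

  list(
    avg_within_dist  = mean(within_cluster),
    avg_between_dist = mean(between_cluster),
    cluster_sizes    = c(length(cluster1), length(cluster2)),
    smallest_cluster = min(length(cluster1), length(cluster2))
  )
}

# Hypothetical wrapper producing the clara2 element from a dist object,
# using clara() from the cluster package as in the snippet above.
clara2_summary <- function(dmat) {
  z <- as.matrix(dmat)
  y <- cluster::clara(z, 2)
  cluster_dist_summary(z, y$clustering)
}

# Toy check with a known assignment: points 0, 1, 2 versus 10, 11.
res <- cluster_dist_summary(dist(c(0, 1, 2, 10, 11)), c(1, 1, 1, 2, 2))
print(res$avg_within_dist)   # 1.25
print(res$avg_between_dist)  # 9.5
print(res$smallest_cluster)  # 2
```

Splitting the arithmetic from the clustering call keeps the summary testable with a fixed assignment, since clara() draws random subsamples on larger inputs. If only one cluster has more than one member, the within-cluster mean is computed from that cluster alone; if both clusters are singletons it is NaN.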