We currently support generating clustering results across a range of parameters with the sweep_clusters() function, but the functions in calculate-clusters.R only support calculating metrics for one set of clustering results. To make the plots described in #9, it would be helpful to have a function that calculates one or all of the metrics on the clustering output from sweep_clusters().
I think this would take the following arguments:
List of data frames with clustering results using different clustering parameters output from sweep_clusters().
Metric(s) to calculate. This could be a character vector that specifies which metrics to calculate. For example, providing c("purity", "width") would run both calculate_purity() and calculate_silhouette() on every data frame of clustering results. Alternatively, we could use a separate flag for each metric: width, purity, and stability.
The output would be a list of data frames with one data frame for each metric. That means there would be one data frame that contains all the results from the purity calculations for all clustering results that were output from sweep_clusters(), one for width, and one for stability. Then these data frames could be provided as input to the function for plotting described in #9.
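To make the proposed input and output shapes concrete, here is a minimal sketch in base R. The function name calculate_sweep_metrics(), its arguments, and the toy data are all hypothetical and only illustrate the idea. Because the exact signatures of calculate_purity(), calculate_silhouette(), and the stability calculation are not spelled out here, the sketch takes the metric functions as a named list supplied by the caller, and a stand-in metric is used in the example.

```r
# Hypothetical sketch: apply each metric function to every clustering result
# and return one combined data frame per metric.
#
# sweep_list: list of data frames, assumed to look like sweep_clusters() output
# metric_fns: named list of functions; each takes one data frame of clustering
#             results and returns a data frame of metric values
calculate_sweep_metrics <- function(sweep_list, metric_fns) {
  stopifnot(
    is.list(sweep_list),
    is.list(metric_fns),
    !is.null(names(metric_fns))
  )

  # Outer loop over metrics, inner loop over clustering results
  lapply(metric_fns, function(metric_fn) {
    per_result <- lapply(sweep_list, metric_fn)
    # Stack the per-result data frames into one data frame for this metric
    do.call(rbind, per_result)
  })
}

# Toy stand-in for sweep_clusters() output (column names are assumptions)
toy_sweep <- list(
  data.frame(cell_id = c("a", "b", "c"), cluster = c(1, 1, 2), resolution = 0.5),
  data.frame(cell_id = c("a", "b", "c"), cluster = c(1, 2, 3), resolution = 1.0)
)

# Stand-in metric: counts clusters; a real call would wrap calculate_purity(),
# calculate_silhouette(), or the stability calculation instead
toy_metric <- function(df) {
  data.frame(
    resolution = df$resolution[1],
    n_clusters = length(unique(df$cluster))
  )
}

metrics <- calculate_sweep_metrics(
  toy_sweep,
  list(purity = toy_metric, width = toy_metric)
)

names(metrics)
```

Passing the metrics as a named list keeps the sketch independent of how the metric selection is eventually exposed; a character vector such as c("purity", "width") or per-metric flags could be mapped to the same list internally.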