pkimes / sigclust2

tests for statistical significance of clustering

Use of Gower distances #8

Open kurpav00 opened 2 years ago


Hello,

Would it be possible to extend the shc function to support Gower distances? At present, shc accepts only numeric matrices, whereas Gower distances are computed from a data frame with mixed column types, which shc rejects as input. To illustrate:

> frame <- data.frame(var1=c(1,2,3,4,5,6),var2=c(0.1,0.2,0.3,0.4,0.5,0.6),var3=c("a","a","a","b","b","b"))
> frame$var3 <- as.factor(frame$var3)
> shc(frame,n_min = 3)
Error in shc(frame, n_min = 3) : 
  x must be a matrix; use as.matrix if necessary
> shc(as.matrix(frame),n_min = 3)
Error in colMeans(x) : 'x' must be numeric
In addition: Warning message:
In dist(x, method = metric, p = l) : NAs introduced by coercion
> shc(as.matrix(frame),n_min = 3,matmet = function(x){cluster::daisy(x,metric = "gower")})
Error in cluster::daisy(x, metric = "gower") : 
  x is not a dataframe or a numeric matrix.
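
As far as I can tell, the last two errors share a cause: as.matrix() turns a data frame containing a factor column into a character matrix, so neither colMeans() nor daisy() receives numeric data. The sketch below reproduces that coercion and shows that daisy() itself works when given the data frame directly. It assumes the cluster package is installed; the names m, is_char, d and n_obs are only for illustration.

```r
# Minimal sketch; 'frame' mirrors the example above.
frame <- data.frame(
  var1 = c(1, 2, 3, 4, 5, 6),
  var2 = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
  var3 = factor(c("a", "a", "a", "b", "b", "b"))
)

# as.matrix() on a data frame with a factor column yields a character matrix,
# so var1 and var2 are no longer numeric by the time shc sees them.
m <- as.matrix(frame)
is_char <- is.character(m)

# daisy() computes Gower dissimilarities directly from the mixed-type data frame.
d <- cluster::daisy(frame, metric = "gower")
n_obs <- attr(d, "Size")  # number of observations the dissimilarity covers
```

So the matmet route seems to fail only because it is handed the coerced matrix. Presumably shc would need to accept either the data frame itself or a precomputed dissimilarity for this to work, but that is a guess at a possible design, not something the package offers today.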

Thank you in advance.