kogalur / randomForestSRC

DOCUMENTATION:
https://www.randomforestsrc.org/
GNU General Public License v3.0

how to predict new data in unsupervised forest #385

Closed: mytarmail closed this issue 11 months ago

mytarmail commented 11 months ago

How can I predict cluster membership for new data after fitting sidClustering?

library(randomForestSRC)
m <- matrix(rnorm(1000),ncol = 5)
o <- sidClustering(m, k = 5)
o$clustering

Is there a predict method for the fitted object?

ishwaran commented 11 months ago

Hi, the sidClustering function is designed for clustering, not prediction. If you want to fit a multivariate RF for prediction purposes, you should use the rfsrc function. See the following vignette for some examples of how to use multivariate RF.
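For readers following along, here is a minimal sketch of the multivariate route suggested above. It is not taken from the vignette: the mtcars data, the choice of mpg and disp as outcomes, and the train/test split are assumptions made purely for illustration.

```r
library(randomForestSRC)

set.seed(1)

# Illustrative split of a built-in data set (an assumption, not from the thread)
train <- mtcars[1:27, ]
test  <- mtcars[28:32, ]

# Multivar() on the left-hand side requests a multivariate forest
# with two continuous outcomes
fit <- rfsrc(Multivar(mpg, disp) ~ ., data = train)

# predict() takes new data containing the same feature columns
pred <- predict(fit, newdata = test)

# One column of predictions per outcome; oob = FALSE because
# there are no out-of-bag values for held-out rows
yhat <- get.mv.predicted(pred, oob = FALSE)
print(yhat)
```

This covers supervised prediction only. It does not assign new rows to clusters found by sidClustering.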

mytarmail commented 11 months ago

> Hi, the sidClustering function is designed for clustering, not prediction. If you want to fit a multivariate RF for prediction purposes, you should use the rfsrc function. See the following vignette for some examples of how to use multivariate RF.

Hello! I understand that this is a clustering function, but the vast majority of clustering algorithms provide a prediction method. Since that is normal practice, I expected one to be available here as well. In any case, thanks for the reply.

ishwaran commented 11 months ago

Yes, it's true that you can do prediction with many clustering algorithms. The issue here is that sidClustering applies special processing to the X-features before it fits a random forest model. Because of this, it's difficult (but not impossible) to set it up for prediction. I will look into this and may add the option in the future. Thanks for your input.
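Until such an option exists, one possible workaround is to train a separate classifier on the cluster labels and use it to label new rows. The sketch below is not a sidClustering feature and does not reproduce its internal feature processing. It only approximates the clustering with a surrogate rfsrc classification forest, and the names train, new, labels, and clf are hypothetical.

```r
library(randomForestSRC)

set.seed(1)

# Simulated data, as in the original question, stored as data frames
# so that the training and new data share column names (X1..X5)
train <- data.frame(matrix(rnorm(1000), ncol = 5))
new   <- data.frame(matrix(rnorm(50), ncol = 5))

# Step 1: unsupervised clustering of the training data
o <- sidClustering(train, k = 5)
labels <- factor(as.vector(o$clustering))

# Step 2: surrogate classification forest mapping features to cluster labels
clf <- rfsrc(cluster ~ ., data = data.frame(train, cluster = labels))

# Step 3: assign each new observation to one of the learned clusters
new.cluster <- predict(clf, newdata = new)$class
print(table(new.cluster))
```

The surrogate is only as good as its fit to the original labels, so it is worth checking its out-of-bag error on the training data before trusting the assignments.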