asardaes / dtwclust

R Package for Time Series Clustering Along with Optimizations for DTW
https://cran.r-project.org/package=dtwclust
GNU General Public License v3.0

Tuning k to improve the model #58

Closed by Leprechault 2 years ago

Leprechault commented 2 years ago

Hi Everyone!!

I'd like to find the optimal number of clusters (k) for my time series. Is there any tool for hyperparameter tuning of spatio-temporal k-means clustering with the dtwclust package?

Thanks in advance!

asardaes commented 2 years ago

Hello, I'm not sure whether such a tool already exists, but I know that latrend lists dtwclust as a suggested package, so maybe latrend has the tools you need. I've never used it, but it might be helpful.

Leprechault commented 2 years ago

Here's an example using latrend: specify a KmL method, define it for 1 to 5 clusters, fit the resulting list of method definitions, and then plot a metric to identify a suitable number of clusters.

library(latrend)
data(latrendData)
# define KmL
method = lcMethodKML(response = 'Y')
methods = lcMethods(method, nClusters = 1:5)
# fit the specified methods
models = latrendBatch(methods, data = latrendData)

plotMetric(models, 'RSS')

# select best model by minimizing the criterion (not recommended)
bestModel = min(models, 'RSS')

# preferably, assess and select the best model manually
bestModel = models[[2]]
# or
bestModel = subset(models, nClusters == 2, drop = TRUE)

plot(bestModel)

Thanks a lot, Niek Den Teuling, for your help.
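
For readers who want to stay within dtwclust itself, the same idea can be sketched by passing a vector of k values to tsclust() and comparing cluster validity indices with cvi(). This is only a minimal sketch under stated assumptions: it uses the CharTraj example data shipped with dtwclust (subset to keep it fast), and the distance, centroid, seed, and range of k are illustrative choices, not recommendations from this thread.

```r
library(dtwclust)

# Example data bundled with dtwclust (assumption: stands in for your own series)
data("uciCT")
series <- CharTraj[1:30]

# Candidate numbers of clusters (illustrative range)
ks <- 2L:6L

# Passing a vector of k returns a list with one clustering result per value
results <- tsclust(series, type = "partitional", k = ks,
                   distance = "dtw_basic", centroid = "pam",
                   seed = 123L)
names(results) <- paste0("k_", ks)

# Internal cluster validity indices, one column per value of k
cvis <- sapply(results, cvi, type = "internal")
print(cvis)

# One possible criterion: pick the clustering with the largest silhouette index
best <- results[[which.max(cvis["Sil", ])]]
best@k
```

As with the latrend example above, no single index is definitive. Some of the indices are to be maximized and others minimized, so it is usually safer to inspect several of them, and the resulting clusters, before settling on k.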