asardaes / dtwclust

R Package for Time Series Clustering Along with Optimizations for DTW
https://cran.r-project.org/package=dtwclust
GNU General Public License v3.0

Can I define certain series to always be the clustering centers? #3

Closed Wang-Yu-Qing closed 8 years ago

Wang-Yu-Qing commented 8 years ago

Hey guys! I have come across a question. When using dtwclust with the 'partitional' method, can I force certain series to be the centroids throughout the whole process? For example, I have 1000 series, among which are s1, s2 and s3. I want to use dtwclust to do partitional clustering, and the outcome should be three clusters with s1, s2 and s3 as their respective centroids. Is this possible? Thank you for your attention! ^_^

asardaes commented 8 years ago

Well, it cannot be done with dtwclust, but you could simply do something like this:

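library(proxy)

# Rows: the series to be assigned; columns: the fixed centroids.
# Both arguments are lists of series; chosen_distance is the name of a distance registered with proxy.
distance_matrix <- proxy::dist(x = non_centroid_series_list, y = centroid_series_list, 
                               method = chosen_distance)

# For each series (row), take the index of the closest centroid (column).
membership_indices <- apply(distance_matrix, 1L, which.min)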
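
For a fuller picture, here is a minimal runnable sketch of the same idea on toy data. The series below are made up purely for illustration (shifted sine waves standing in for s1, s2, s3 and a few other series), and it assumes the proxy and dtw packages are installed, with "DTW" as the chosen distance.

```r
# Sketch only: toy data, not taken from the question.
library(proxy)
library(dtw)  # loading dtw makes "DTW" available to proxy::dist

time_idx <- seq(0, 2 * pi, length.out = 50)

# Hypothetical fixed centroids (stand-ins for s1, s2 and s3).
centroid_series_list <- list(
  s1 = sin(time_idx),
  s2 = sin(time_idx) + 5,
  s3 = sin(time_idx) - 5
)

# Hypothetical remaining series that should be assigned to a centroid.
non_centroid_series_list <- list(
  a = sin(time_idx) + 0.1,
  b = sin(time_idx) + 4.8,
  c = sin(time_idx) - 5.3,
  d = sin(time_idx + 0.3)
)

# Cross-distance matrix: one row per series, one column per centroid.
distance_matrix <- proxy::dist(x = non_centroid_series_list,
                               y = centroid_series_list,
                               method = "DTW")

# Index of the nearest centroid for every series.
membership_indices <- apply(distance_matrix, 1L, which.min)

# Group the series names by the centroid they were assigned to.
clusters <- split(names(non_centroid_series_list), membership_indices)
print(membership_indices)
print(clusters)
```

With these toy series, a and d should end up with centroid 1, b with centroid 2 and c with centroid 3, because the centroids are shifted far apart from each other. Note that this is a single assignment step: the centroids never change, so there is nothing to iterate.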

Wang-Yu-Qing commented 8 years ago

@asardaes Thanks a lot. But if I use 'dist()' as you suggested above, won't the distance between series be computed with something other than DTW?

asardaes commented 8 years ago

It will if chosen_distance is "DTW" or "DTW2". You can register many distances with proxy, and dtwclust registers all its distances there. Check the following:

library(dtwclust)
summary(pr_DB) # pr_DB is the proxy database

See the documentation of the proxy package for more information on what it does and how to use it.
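
If you only want to check specific names instead of reading the whole summary, something along these lines should work. It is a small sketch that assumes dtwclust is installed and uses the registry accessors on pr_DB (entry_exists and get_entry); the lookup of "DTW2" is guarded in case your version does not register that name.

```r
# Sketch only: queries the proxy registry for two distance names.
library(dtwclust)  # loading it registers its distances with proxy

# Check whether each name is known to the proxy database.
has_dtw  <- proxy::pr_DB$entry_exists("DTW")
has_dtw2 <- proxy::pr_DB$entry_exists("DTW2")
print(c(DTW = has_dtw, DTW2 = has_dtw2))

# Inspect a single entry, e.g. to see the aliases it was registered under.
if (has_dtw2) {
  entry <- proxy::pr_DB$get_entry("DTW2")
  print(entry$names)
}
```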

Wang-Yu-Qing commented 8 years ago

@asardaes Thank you, I will give it a try later!

Wang-Yu-Qing commented 8 years ago

Where can I find the differences between the two distance measures 'DTW' and 'DTW2'? I read the documentation of the 'proxy' package, but there is no description of them there.

asardaes commented 8 years ago

Those two distances are registered into proxy by dtw and dtwclust, respectively. DTW is the "normal" one; you can read the documentation of the dtw package for it. DTW2 is explained in the documentation of dtwclust: it is the same DTW algorithm, but it uses the L2 norm for the local cost matrix.
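
A quick way to see the two side by side is to compute both on the same pair of series. This is only a sketch: the two short series are made up, and it assumes dtwclust is installed and that both "DTW" and "DTW2" are registered with proxy in your version.

```r
# Sketch only: compares the two registered distances on a made-up pair of series.
library(dtwclust)  # registers "DTW2"; its dependency dtw provides "DTW"

# proxy::dist works on lists of series, so wrap each series in a list.
x <- list(c(0, 1, 2))
y <- list(c(0, 3, 2))

d_dtw  <- proxy::dist(x, y, method = "DTW")
d_dtw2 <- proxy::dist(x, y, method = "DTW2")

# Each result is a 1 x 1 cross-distance matrix; flatten it for printing.
print(c(DTW = as.numeric(d_dtw), DTW2 = as.numeric(d_dtw2)))
```

In general the two values will not be equal, since the choice of norm changes the costs that DTW accumulates along the warping path.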