How I know how many seed that I should fill

asardaes / dtwclust

R Package for Time Series Clustering Along with Optimizations for DTW

https://cran.r-project.org/package=dtwclust

GNU General Public License v3.0

How I know how many seed that I should fill #35

Closed duangrux closed 5 years ago

duangrux commented 5 years ago

Dear @asardaes, in the code below, how do I know how many seeds I should provide? And is the seed argument suitable for all clustering algorithms, such as hierarchical, partitional and fuzzy? All I know is that the seed is the random seed for reproducibility.

pc_dtw <- tsclust(data_z, k = 6L,
                  distance = "dtw_basic", centroid = "dba",
                  trace = TRUE, seed = 8,
                  norm = "L2", window.size = 20L,
                  args = tsclust_args(cent = list(trace = TRUE)))

asardaes commented 5 years ago

You only need to provide 1 seed for any scenario. Partitional and fuzzy clustering do need it if you want reproducible results. Hierarchical itself is not subject to randomness, but preprocessing or centroid functions may be (e.g. DBA). If in doubt, provide a seed.
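
As a rough illustration of the point above, the sketch below runs the same partitional clustering twice with one seed and compares the assignments. It assumes the `uciCT` sample data that ships with dtwclust (`CharTraj`); the subset, `k = 4L`, the PAM centroid, and the helper name `run_once` are illustrative choices, not taken from the original question.

```r
library(dtwclust)

# Sample data bundled with dtwclust (assumption: stands in for data_z)
data(uciCT)
series <- CharTraj[1L:20L]

# Hypothetical helper: one partitional run with a single integer seed
run_once <- function(seed) {
  tsclust(series, k = 4L,
          distance = "dtw_basic", centroid = "pam",
          seed = seed)
}

first  <- run_once(8L)
second <- run_once(8L)

# With the same single seed, the cluster assignments should match
same_assignments <- identical(first@cluster, second@cluster)
print(same_assignments)
```

The same single `seed` argument can be passed when `type` is `"hierarchical"` or `"fuzzy"`; it simply has no effect where no random step is involved.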

duangrux commented 5 years ago

Dear @asardaes, thank you for your reply. I have another question: I would like to evaluate each scenario with the following code,

sapply(list(HC=hc, DTW = pc_dtw, DTW = pc_dtw, kShape = pc_sbd, TADPole = pc_tp),
       cvi, b = data_z[1L:200L], type = "internal")

and the result is an error. How do I fix the code?

Error in cvi(a@cluster, b = b, type = type[which_external], ...) : 
  (list) object cannot be coerced to type 'integer'

asardaes commented 5 years ago

What cvi expects in b are data labels, not the data itself.
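
A minimal sketch of the difference, assuming the `uciCT` sample data from dtwclust (`CharTraj` and its ground-truth `CharTrajLabels`) in place of `data_z`; the subset and `k = 4L` are illustrative only. Internal indices are computed from the clustering object alone, so `b` can be left out; external indices compare the clustering against known labels, which is what `b` is for.

```r
library(dtwclust)

# Sample data bundled with dtwclust (assumption: stands in for data_z)
data(uciCT)
series <- CharTraj[1L:20L]
labels <- CharTrajLabels[1L:20L]   # ground-truth labels, one per series

pc <- tsclust(series, k = 4L,
              distance = "dtw_basic", centroid = "pam",
              seed = 8L)

# Internal CVIs: no labels required
internal_cvis <- cvi(pc, type = "internal")

# External CVIs: b holds the labels, not the series themselves
external_cvis <- cvi(pc, b = labels, type = "external")

print(internal_cvis)
print(external_cvis)
```

To compare several clusterings on internal indices, the same call can be mapped over a list, for example `sapply(list(A = pc, B = pc), cvi, type = "internal")`, where the list names are placeholders.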

duangrux commented 5 years ago

It works! May I ask next: which CVI is best for comparing clustering algorithms or scenarios?

asardaes commented 5 years ago

That's something I cannot answer, it depends on too many things, some of which I am not familiar with.

duangrux commented 5 years ago

Alright, and thank you for your constant support :)