asardaes / dtwclust

R Package for Time Series Clustering Along with Optimizations for DTW
https://cran.r-project.org/package=dtwclust
GNU General Public License v3.0

Problem in converting data frame into a list of time series to tsclust #65

Closed. Leprechault closed this issue 9 months ago

Leprechault commented 9 months ago

I'd like to cluster time series with the dtwclust package. The problem is converting my data.frame into a list of time series. Each block ID (STAND) has 180 days, stored as negative values (DATE_TIME), and B2_MAX is my response variable. My example:

    library(dplyr)
    library(ggplot2)
    library(dtwclust)

    all.B2_MAX.stands <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_ts_data.csv")

    all.B2_MAX.tsc <- all.B2_MAX.stands %>%
      group_by(STAND) %>%
      summarise(var = list(B2_MAX[order(DATE_TIME)]), 
                var_ts = purrr::map(var, ts))

    clusters <- tsclust(all.B2_MAX.tsc[-1], 
                       type="partitional", 
                       k=2L, 
                       distance="dtw",
                       centroid = "pam")

    # plot
    plot(clusters, type = "sc")

    #Error in lapply(series, base::as.numeric) : 
    #  'list' object cannot be coerced to type 'double'

Can anyone help with this?

Leprechault commented 9 months ago

    library(dtwclust)

    d <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_ts_data.csv")
    # one numeric vector of B2_MAX values per STAND
    l <- split(d$B2_MAX, d$STAND)
    o <- tsclust(l,
                 type = "partitional",
                 k = 2L,
                 distance = "dtw_basic",
                 centroid = "pam")
    # plot
    plot(o)
    o
    #partitional clustering with 2 clusters
    #Using dtw_basic distance
    #Using pam centroids

    #Time required for analysis:
    #   user  system elapsed
    #   1.13    0.00    0.16

    #Cluster sizes with average intra-cluster distance:

    #  size       av_dist
    # 1   14 3.518299e+198
    # 2   50  4.526561e+08
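
One caveat about the `split()` call: it keeps rows in file order, whereas the first attempt sorted each stand by `DATE_TIME`. Below is a minimal sketch of sorting before splitting, assuming the CSV rows might not already be in date order. The toy data frame is made up for illustration; only the column names `STAND`, `DATE_TIME` and `B2_MAX` come from this thread. The clustering step runs only if dtwclust is installed.

```r
# Toy data standing in for my_ts_data.csv (values are invented;
# B2_MAX is built from DATE_TIME so the time order is easy to verify).
d <- data.frame(
  STAND     = rep(c("A", "B", "C", "D"), each = 5),
  DATE_TIME = rep(c(-1, -5, -3, -2, -4), times = 4)  # deliberately unsorted
)
d$B2_MAX <- 100 + d$DATE_TIME + rep(c(0, 10, 20, 30), each = 5)

# Sort by stand, then by date, so every split piece is in time order.
d <- d[order(d$STAND, d$DATE_TIME), ]

# One plain numeric vector per stand, named by stand ID.
series <- split(d$B2_MAX, d$STAND)
str(series)

# The structure passed to tsclust(): a list of numeric vectors,
# not a data frame with list columns.
stopifnot(is.list(series),
          all(vapply(series, is.numeric, logical(1))))

# Optional clustering step, skipped when dtwclust is not available.
if (requireNamespace("dtwclust", quietly = TRUE)) {
  fit <- dtwclust::tsclust(series,
                           type = "partitional",
                           k = 2L,
                           distance = "dtw_basic",
                           centroid = "pam")
  print(fit)
}
```

If the rows are already sorted by `DATE_TIME` within each stand, the `order()` step changes nothing and the result matches the plain `split()` above.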