asardaes / dtwclust

R Package for Time Series Clustering Along with Optimizations for DTW
https://cran.r-project.org/package=dtwclust
GNU General Public License v3.0

The score.clus function(s) did not execute successfully, compare_clusterings, pick and scores both NULL #55

Closed (vidarsumo closed 2 years ago)

vidarsumo commented 2 years ago

I ran the examples from the reference manual for compare_clusterings(), but with my own data, and I get NULL in both pick and scores.

# Fuzzy preprocessing: calculate autocorrelation up to 50th lag
acf_fun <- function(series, ...) {
  lapply(series, function(x) {
    as.numeric(acf(x, lag.max = 50, plot = FALSE)$acf)
  })
}
# Define overall configuration
cfgs <- compare_clusterings_configs(
  types = c("p", "h", "f", "t"),
  k = 19L:20L,
  controls = list(
    partitional = partitional_control(
      iter.max = 30L,
      nrep = 1L
    ),
    hierarchical = hierarchical_control(
      method = "all"
    ),
    fuzzy = fuzzy_control(
      # notice the vector
      fuzziness = c(2, 2.5),
      iter.max = 30L
    ),
    tadpole = tadpole_control(
      # notice the vectors
      dc = c(1.5, 2),
      window.size = 19L:20L
    )
  ),
  preprocs = pdc_configs(
    type = "preproc",
    # shared
    none = list(),
    zscore = list(center = c(FALSE)),
    # only for fuzzy
    fuzzy = list(
      acf_fun = list()
    ),
    # only for tadpole
    tadpole = list(
      reinterpolate = list(new.length = 205L)
    ),
    # specify which should consider the shared ones
    share.config = c("p", "h")
  ),
  distances = pdc_configs(
    type = "distance",
    sbd = list(),
    fuzzy = list(
      L2 = list()
    ),
    share.config = c("p", "h")
  ),
  centroids = pdc_configs(
    type = "centroid",
    partitional = list(
      pam = list()
    ),
    # special name 'default'
    hierarchical = list(
      default = list()
    ),
    fuzzy = list(
      fcmdd = list()
    ),
    tadpole = list(
      default = list(),
      shape_extraction = list(znorm = TRUE)
    )
  )
)

num_configs <- sapply(cfgs, attr, which = "num.configs")
cat("\nTotal number of configurations without considering optimizations:",
    sum(num_configs),
    "\n\n")

vi_evaluators <- cvi_evaluators("valid")
score_fun <- vi_evaluators$score
pick_fun <- vi_evaluators$pick

library(doParallel)
registerDoParallel(cl <- makeCluster(detectCores()))
comparison_long <- compare_clusterings(CharTraj, types = c("p", "h", "f", "t"),
                                       configs = cfgs,
                                       seed = 293L, trace = TRUE,
                                       score.clus = score_fun,
                                       pick.clus = pick_fun,
                                       return.objects = TRUE)

stopCluster(cl); registerDoSEQ()

comparison_long$pick

Here, comparison_long$pick is NULL.

I get the following warning message:

Warning message:
In compare_clusterings(CharTraj, types = c("p", "h", "f", "t"), :
  The score.clus function(s) did not execute successfully:
'arg' should be one of “MPC”, “K”, “T”, “SC”, “PBMF”, “RI”, “ARI”, “VI”, “NMIM”, “valid”, “internal”, “external”

R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

Random number generation:
 RNG:     L'Ecuyer-CMRG 
 Normal:  Inversion 
 Sample:  Rejection 

locale:
[1] LC_COLLATE=Icelandic_Iceland.1252  LC_CTYPE=Icelandic_Iceland.1252    LC_MONETARY=Icelandic_Iceland.1252 LC_NUMERIC=C                      
[5] LC_TIME=Icelandic_Iceland.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] AzureKeyVault_1.0.5 AzureStor_3.5.2     doParallel_1.0.16   iterators_1.0.13    foreach_1.5.1       dtwclust_5.5.6.9000 dtw_1.22-3         
 [8] proxy_0.4-26        sumots_0.1.0        zoo_1.8-9           timetk_2.6.2        tictoc_1.0.1        lubridate_1.8.0     forcats_0.5.1      
[15] stringr_1.4.0       dplyr_1.0.7         purrr_0.3.4         readr_2.1.1         tidyr_1.1.4         tibble_3.1.6        ggplot2_3.3.5      
[22] tidyverse_1.3.1    

loaded via a namespace (and not attached):
  [1] readxl_1.3.1        backports_1.4.1     workflows_0.2.4     plyr_1.8.6          splines_4.1.1       listenv_0.8.0       digest_0.6.29      
  [8] htmltools_0.5.2     yardstick_0.0.9     parsnip_0.1.7.900   fansi_0.5.0         magrittr_2.0.1      tune_0.1.6.9000     cluster_2.1.2      
 [15] tzdb_0.2.0          remotes_2.4.2       recipes_0.1.17      globals_0.14.0      modelr_0.1.8        gower_0.2.2         RcppParallel_5.1.4 
 [22] xts_0.12.1          askpass_1.1         hardhat_0.1.6       rsample_0.1.1       prettyunits_1.1.1   dials_0.0.10.9000   colorspace_2.0-2   
 [29] rvest_1.0.2         rappdirs_0.3.3      ggrepel_0.9.1       haven_2.4.3         callr_3.7.0         crayon_1.4.2        jsonlite_1.7.2     
 [36] bigmemory.sri_0.1.3 survival_3.2-11     glue_1.5.1          AzureRMR_2.4.3      gtable_0.3.0        ipred_0.9-12        pkgbuild_1.3.0     
 [43] future.apply_1.8.1  scales_1.1.1        DBI_1.1.1           Rcpp_1.0.7          xtable_1.8-4        clue_0.3-60         GPfit_1.0-8        
 [50] stats4_4.1.1        lava_1.6.10         prodlim_2019.11.13  httr_1.4.2          ellipsis_0.3.2      modeltools_0.2-23   pkgconfig_2.0.3    
 [57] nnet_7.3-16         dbplyr_2.1.1        utf8_1.2.2          tidyselect_1.1.1    rlang_0.4.12        DiceDesign_1.9      reshape2_1.4.4     
 [64] later_1.3.0         munsell_0.5.0       cellranger_1.1.0    tools_4.1.1         cli_3.1.0           generics_0.1.1      broom_0.7.10       
 [71] fastmap_1.1.0       processx_3.5.2      fs_1.5.2            future_1.23.0       mime_0.12           AzureGraph_1.3.2    xml2_1.3.3         
 [78] flexclust_1.4-0     compiler_4.1.1      rstudioapi_0.13     curl_4.3.2          reprex_2.0.1        lhs_1.1.3           stringi_1.7.6      
 [85] ps_1.6.0            RSpectra_0.16-0     lattice_0.20-44     Matrix_1.3-4        nloptr_1.2.2.3      shinyjs_2.0.0       vctrs_0.3.8        
 [92] pillar_1.6.4        lifecycle_1.0.1     furrr_0.2.3         bigmemory_4.5.36    httpuv_1.6.4        R6_2.5.1            promises_1.2.0.1   
 [99] parallelly_1.29.0   codetools_0.2-18    MASS_7.3-54         assertthat_0.2.1    openssl_1.4.5       rprojroot_2.0.2     withr_2.4.3        
[106] hms_1.1.1           grid_4.1.1          AzureAuth_1.3.3     rpart_4.1-15        timeDate_3043.102   class_7.3-19        pROC_1.18.0        
[113] shiny_1.7.1

asardaes commented 2 years ago

I imagine the documentation could use some clarification. Using cvi_evaluators won't always work if you mix fuzzy and non-fuzzy (crisp) clustering; you'll notice that the cvi function has an argument to differentiate. Why the examples work is not made explicit: in one case "VI" is used, and there is a version of it for both fuzzy and crisp partitions. In the other case, external CVIs are used, which implicitly convert the fuzzy partition into a crisp one. Passing "valid" as type isn't really valid if you want to evaluate fuzzy and crisp clusterings in one go.
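
One possible way to act on this is to split the comparison into a crisp run and a fuzzy run, each with its own evaluators. The following is only a minimal sketch, not the package's documented recipe. It assumes that cvi_evaluators() accepts a `fuzzy` flag to select the fuzzy or crisp set of CVIs; the 20-series subset, the values of k, and the names `series`, `crisp_cfgs`, `fuzzy_cfgs`, `crisp_eval` and `fuzzy_eval` are illustrative choices, not taken from the report.

```r
library(dtwclust)

# Illustrative subset of the bundled CharTraj data, to keep the sketch quick
series <- CharTraj[1L:20L]

# Same autocorrelation preprocessing as in the report (yields equal-length series)
acf_fun <- function(series, ...) {
  lapply(series, function(x) {
    as.numeric(acf(x, lag.max = 50, plot = FALSE)$acf)
  })
}

# Crisp configurations only (partitional and hierarchical)
crisp_cfgs <- compare_clusterings_configs(
  types = c("p", "h"),
  k = 2L:3L,
  controls = list(
    partitional = partitional_control(iter.max = 30L, nrep = 1L),
    hierarchical = hierarchical_control(method = "average")
  ),
  preprocs = pdc_configs("preproc", zscore = list(center = FALSE),
                         share.config = c("p", "h")),
  distances = pdc_configs("distance", sbd = list(),
                          share.config = c("p", "h")),
  centroids = pdc_configs("centroid",
                          partitional = list(pam = list()),
                          hierarchical = list(default = list()))
)

# Fuzzy configurations only
fuzzy_cfgs <- compare_clusterings_configs(
  types = "f",
  k = 2L:3L,
  controls = list(fuzzy = fuzzy_control(fuzziness = 2, iter.max = 30L)),
  preprocs = pdc_configs("preproc", fuzzy = list(acf_fun = list())),
  distances = pdc_configs("distance", fuzzy = list(L2 = list())),
  centroids = pdc_configs("centroid", fuzzy = list(fcmdd = list()))
)

# Assumption: the fuzzy flag switches between the crisp and fuzzy CVI sets
crisp_eval <- cvi_evaluators("valid", fuzzy = FALSE)
fuzzy_eval <- cvi_evaluators("valid", fuzzy = TRUE)

# Each run only ever scores partitions of the kind its evaluators expect
crisp_comparison <- compare_clusterings(series, types = c("p", "h"),
                                        configs = crisp_cfgs, seed = 293L,
                                        score.clus = crisp_eval$score,
                                        pick.clus = crisp_eval$pick)
fuzzy_comparison <- compare_clusterings(series, types = "f",
                                        configs = fuzzy_cfgs, seed = 293L,
                                        score.clus = fuzzy_eval$score,
                                        pick.clus = fuzzy_eval$pick)

crisp_comparison$pick
fuzzy_comparison$pick
```

The two picks come from different index sets, so they are not directly comparable with each other; choosing between a crisp and a fuzzy result would still need a shared criterion such as "VI" or an external CVI with ground-truth labels.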

vidarsumo commented 2 years ago

OK, I see. I'll leave fuzzy out for the moment and run this again.

asardaes commented 2 years ago

I've updated the documentation; please reopen if this was not enough.