SugiharaLab / rEDM

Applications of Empirical Dynamic Modeling from Time Series

Error in INTERNAL_EmbedDimension(pathIn, dataFile, dataFrame, pathOut, : FindNeighbors(): Library is too small to resolve 2 knn neighbors. #38

Closed desertnaut closed 4 years ago

desertnaut commented 4 years ago

I am trying to use the package to analyze and predict the monthly sunspot data in rolling windows. Here is the fully reproducible code:

library(rEDM)

df <- data.frame(yr = as.numeric(time(sunspot.month)), 
                 sunspot_count = as.numeric(sunspot.month))

# make indices for 11 rolling splits

train_splits <- rep(NA, 11)
test_splits <- rep(NA, 11)

periods_train <- 12 * 50 # 50 yrs
periods_test  <- 12 * 10 # 10 yrs
skip_span     <- 12 * 20 # 20 yrs

for (k in 1:11) {
  train_start <- (k-1)*skip_span + 1
  train_stop <- train_start + periods_train -1
  test_start <- train_stop + 1
  test_stop <- test_start + periods_test -1

  train_splits[k] <- paste(as.character(train_start), as.character(train_stop))
  test_splits[k] <- paste(as.character(test_start), as.character(test_stop))
}

# END make indices

# Try embeddings & predictions

k <- 1

E.opt <- EmbedDimension( dataFrame = df,               # input data
                        lib     = train_splits[k],     # portion of data to train
                        pred    = test_splits[k],      # portion of data to predict
                        columns = "sunspot_count",
                        target  = "sunspot_count")

# works OK with k = 1-3
# for k > 3, fails with:

# Error in INTERNAL_EmbedDimension(pathIn, dataFile, dataFrame, pathOut,  : 
#   FindNeighbors(): Library is too small to resolve 2 knn neighbors.

This works for k = 1, 2, and 3, but for larger values of k (up to 11) it fails with the error message in the title.
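
To map out exactly which splits fail in a single pass, each call can be wrapped in `tryCatch`. The sketch below is illustrative only: `try_splits`, `fit_fun`, and `fake_fit` are hypothetical names, not part of rEDM, and `fake_fit` merely mimics the behavior reported above so that the block runs without the package.

```r
# Hypothetical helper (not part of rEDM): run fit_fun(k) for each split and
# record whether it succeeded, capturing the error message when it did not.
try_splits <- function(n_splits, fit_fun) {
  rows <- lapply(seq_len(n_splits), function(k) {
    msg <- tryCatch({
      fit_fun(k)
      NA_character_            # no error for this split
    }, error = function(e) conditionMessage(e))
    data.frame(k = k, ok = is.na(msg), error = msg, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Stand-in for the real call, mimicking the behavior reported above
# (splits 1-3 succeed, later ones raise an error).
fake_fit <- function(k) {
  if (k > 3) stop("Library is too small to resolve 2 knn neighbors.")
  invisible(k)
}

status <- try_splits(11, fake_fit)
print(status[, c("k", "ok")])
```

With rEDM loaded and the objects from the code above in scope, `fit_fun` would instead be something like `function(k) EmbedDimension(dataFrame = df, lib = train_splits[k], pred = test_splits[k], columns = "sunspot_count", target = "sunspot_count")`.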

I wonder, since the size of the library is the same for every split (600 data points), and it works OK with the first 3 splits, why is this happening?
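
As a sanity check on the claim that every library has the same size, the window arithmetic can be recomputed numerically. This is a sketch in base R only; it assumes the built-in `sunspot.month` series, whose length can differ between R versions, and `windows` is just an illustrative name.

```r
periods_train <- 12 * 50  # 50 yrs
periods_test  <- 12 * 10  # 10 yrs
skip_span     <- 12 * 20  # 20 yrs

# Same index arithmetic as the loop above, vectorized over all 11 splits
k <- 1:11
train_start <- (k - 1) * skip_span + 1
train_stop  <- train_start + periods_train - 1
test_start  <- train_stop + 1
test_stop   <- test_start + periods_test - 1

windows <- data.frame(k, train_start, train_stop, test_start, test_stop,
                      lib_size  = train_stop - train_start + 1,
                      pred_size = test_stop - test_start + 1)

# Flag any window that would run past the end of the series
n_obs <- length(sunspot.month)  # built-in datasets::sunspot.month
windows$in_range <- windows$test_stop <= n_obs
print(windows)
```

Every window has 600 library rows and 120 prediction rows, so window length alone does not distinguish the failing splits from the working ones.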

I have tried this with the just-released v1.2.0 as well as with the previous version, 1.1.0; the behavior is the same.

Session info:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252    LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                            LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rEDM_1.2.0

loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1    Rcpp_1.0.3   

desertnaut commented 4 years ago

Despite no response, checking with the latest rEDM version 1.3.7 shows that the issue is now resolved, so I'm closing this.