bbsBayes / bbsBayes2

An R Package for Hierarchical Bayesian Analysis of North American Breeding Bird Survey Data (second iteration)
https://bbsbayes.github.io/bbsBayes2/

Error from run_model, before model is run by Stan #81

Closed tmeeha closed 1 month ago

tmeeha commented 1 month ago

Hello Adam and Brandon,

Wondering if you know what this error is about. It comes from run_model. Just FYI, this is my first go with bbsBayes2. I think (?) it is set up correctly. I followed your directions about running it on Linux.

Thanks, Tim

s <- stratify(by = "bbs_cws", species = "American Dipper")
Using 'bbs_cws' (standard) stratification
Loading BBS data...
Filtering to species American Dipper (7010)
Stratifying data...
Combining BCR 7 and NS and PEI...
Renaming routes...
p <- prepare_data(s)
map <- load_map("bbs_cws")
sp <- prepare_spatial(p, map)
Preparing spatial data...
Identifying neighbours (non-Voronoi method)...
Formating neighbourhood matrices...
Plotting neighbourhood matrices...
print(sp$spatial_data$map)
pm <- prepare_model(sp, model = "first_diff", model_variant="spatial", use_pois=FALSE)
m <- run_model(md, iter_sampling = 100, iter_warmup = 500, chains = 2)
Model executable is up to date!
Error in strsplit(pathmetadata, " ", fixed = TRUE)[[1]] :
  subscript out of bounds

AdamCSmithCWS commented 1 month ago

Thanks @tmeeha .

I ran the following code and it's working at my end. Is it possible that in the last line of code, you've used an object "md" that happened to exist in your working environment, when you intended to use the object "pm"?

s <- stratify(by = "bbs_cws", species = "American Dipper")
p <- prepare_data(s)
map <- load_map("bbs_cws")
sp <- prepare_spatial(p, map)
print(sp$spatial_data$map)
pm <- prepare_model(sp, model = "first_diff", model_variant="spatial", use_pois=FALSE)
## note pm object in next line.
m <- run_model(pm, iter_sampling = 100, iter_warmup = 500, chains = 2)
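
One way to rule out a stale object such as `md` is to look at what is actually in the session before calling run_model(). The following is a minimal base-R sketch; the helper name `describe_arg` is hypothetical and not part of bbsBayes2, and it only reports whether a name exists and what its top-level elements are.

```r
# Hypothetical helper (not part of bbsBayes2): report whether a name exists
# in an environment and, if so, its class and top-level element names.
describe_arg <- function(name, envir = globalenv()) {
  if (!exists(name, envir = envir, inherits = FALSE)) {
    return(paste0("'", name, "' does not exist"))
  }
  obj <- get(name, envir = envir, inherits = FALSE)
  paste0("'", name, "' exists; class: ", paste(class(obj), collapse = "/"),
         "; elements: ", paste(names(obj), collapse = ", "))
}

# In a clean session, 'md' should be reported as missing.
print(describe_arg("md"))
print(describe_arg("pm"))
```

If `md` turns out to exist, removing it with rm(md) or restarting R gives a clean test of the `pm` call.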

tmeeha commented 1 month ago

Hmm. Still getting the error in a clean environment.

m <- run_model(pm, iter_sampling = 100, iter_warmup = 500, chains = 2)
Model executable is up to date!
Error in strsplit(pathmetadata, " ", fixed = TRUE)[[1]] :
  subscript out of bounds

I wonder if I messed up the WSL install. Here is a check on that.

cmdstanr::check_cmdstan_toolchain()
The C++ toolchain required for CmdStan is setup properly!

AdamCSmithCWS commented 1 month ago

I knew it couldn't be that simple :-)

Hmm. It sounds like something is going wrong in the final data-prep stages (i.e., in R, not in Stan or cmdstanr). What version of R and what system are you running on?

tmeeha commented 1 month ago

Yup. I just went to this page -- https://mc-stan.org/cmdstanr/articles/cmdstanr.html -- to test my Stan setup and everything works with the cmdstanr examples.

R version 4.4.1 (2024-06-14 ucrt) -- "Race for Your Life"
Platform: x86_64-w64-mingw32/x64
Windows 10, but Stan is set up to run with WSL Ubuntu
bbsBayes2 version 1.1.1
cmdstanr version 0.8.0
cmdstan version 2.35.0
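
The same details can be gathered in one step for a bug report. This is a rough sketch using base R only; the helper name `pkg_version` is hypothetical, and packages that are not installed are reported as NA rather than raising an error.

```r
# Hypothetical helper: return a package's installed version as text,
# or NA if the package is not installed.
pkg_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Collect R, platform, and package versions into one named vector.
report <- c(
  R         = R.version.string,
  platform  = R.version$platform,
  bbsBayes2 = pkg_version("bbsBayes2"),
  cmdstanr  = pkg_version("cmdstanr")
)
print(report)
```

When cmdstanr is installed and configured, cmdstanr::cmdstan_version() additionally reports the CmdStan version.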

AdamCSmithCWS commented 1 month ago

With a bit more exploration, I see you're correct that the error is coming from the cmdstanr package, within one of its utility functions that checks the WSL CmdStan path.
https://github.com/stan-dev/cmdstanr/blob/master/R/path.R
https://github.com/stan-dev/cmdstanr/blob/master/R/utils.R

I haven't been able to reproduce it yet, but I'll keep working on it.
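
For context, "subscript out of bounds" is the message base R gives when strsplit() receives a zero-length character vector and the first element of its result is then extracted. The sketch below reproduces only that base-R behaviour. The variable name mirrors the one in the error message, and the idea that cmdstanr ended up with an empty value here is an assumption, not something confirmed in the cmdstanr source.

```r
# Assumption: an empty character vector stands in for path metadata that
# could not be read. How cmdstanr arrives at such a value is not shown.
pathmetadata <- character(0)

# strsplit() on a zero-length input returns an empty list.
parts <- strsplit(pathmetadata, " ", fixed = TRUE)
print(length(parts))

# Extracting [[1]] from the empty list raises the error; capture its message.
msg <- tryCatch(
  parts[[1]],
  error = function(e) conditionMessage(e)
)
print(msg)
```

This suggests looking at which path lookup returned nothing, rather than at the model or the data.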

AdamCSmithCWS commented 1 month ago

One thought, can you try this workaround?

library(cmdstanr)
library(bbsBayes2)

s <- stratify(by = "bbs_cws", species = "American Dipper")
p <- prepare_data(s)
map <- load_map("bbs_cws")
sp <- prepare_spatial(p, map)
print(sp$spatial_data$map)
pm <- prepare_model(sp, model = "first_diff", model_variant="spatial", use_pois=FALSE)

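# Extract the data list and add the cross-validation fields the Stan model expects
# (calc_CV = 0 means no cross-validation is run).
model_data <- pm$model_data
model_data[["test"]] <- 1
model_data[["train"]] <- as.integer(1:model_data[["n_counts"]])
model_data[["n_test"]] <- 1
model_data[["n_train"]] <- model_data[["n_counts"]]
model_data[["calc_CV"]] <- 0

# Copy the Stan model file into the working directory so cmdstanr can compile it directly.
copy_model_file(model = "first_diff", model_variant = "spatial", getwd())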

model <- cmdstanr::cmdstan_model("first_diff_spatial_bbs_CV_COPY.stan")

model_fit <- model$sample(data = model_data,
  chains = 2,
  iter_sampling = 100,
  iter_warmup = 500)

tmeeha commented 1 month ago

I might have found the problem. When my working directory lives on BOX (the file sharing cloud software), then cmdstanr::cmdstan_model() can't find the Stan file, even when I give it the full path. When the working directory is local, it can. So I can fix this on my end by not using a BOX folder as my working directory. Sorry for the wild goose chase!
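
A fix along these lines can be sketched as copying the Stan file out of the synced folder into a local directory before compiling. The helper name `stage_locally` is hypothetical, and the demonstration uses a placeholder file instead of a real model. Whether a given local directory is visible to a WSL-based CmdStan install is not checked here.

```r
# Hypothetical helper: copy a Stan file into a local (non-synced) directory
# and return the new path, failing early if the source cannot be read.
stage_locally <- function(stan_file, local_dir = tempdir()) {
  if (!file.exists(stan_file)) {
    stop("Stan file not found: ", stan_file)
  }
  dir.create(local_dir, showWarnings = FALSE, recursive = TRUE)
  target <- file.path(local_dir, basename(stan_file))
  ok <- file.copy(stan_file, target, overwrite = TRUE)
  if (!ok) stop("Could not copy ", stan_file, " to ", local_dir)
  target
}

# Demonstration with a placeholder file standing in for a copied model file.
src <- tempfile(fileext = ".stan")
writeLines("// placeholder Stan program", src)
local_copy <- stage_locally(src, file.path(tempdir(), "stan_local"))
print(file.exists(local_copy))
```

The returned path could then be passed to cmdstanr::cmdstan_model() in place of a path inside the synced folder.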

AdamCSmithCWS commented 1 month ago

I'm going to close this issue, but thanks for bringing this up. Working with Stan "at-scale" (large datasets, many datasets, many models running in parallel, complex network structures, etc.) is often tricky.