James-Thorson-NOAA / VAST

Spatio-temporal analysis of univariate or multivariate data, e.g., standardizing data for multiple species or stages
http://www.FishStats.org
GNU General Public License v3.0
124 stars 53 forks source link

Model crashes when e_i not equal to c_iz #375

Closed quang-huynh closed 1 year ago

quang-huynh commented 1 year ago

I have a multi-category model and wanted to specify an error structure e_i that is different from the default, where each category has its own variance. However the TMB model crashes and I'm not able to deduce what's going on, as n_e and ObsModel_ez seem to be correctly specified from make_data.

Below is a simple reproducible example:

library(VAST)

x <- runif(100)
Lat <- runif(100)
Lon <- runif(100)
y <- 3 * x + rnorm(100)
z <- 2 * x + rnorm(100)

DF <- rbind(
  data.frame(b = y, Lat = Lat, Lon = Lon, c_iz = 0),
  data.frame(b = z, Lat = Lat, Lon = Lon, c_iz = 1)
)

settings <- FishStatsUtils::make_settings(
  n_x = 20,
  Region = "user",
  purpose = "ordination",
  fine_scale = TRUE,
  n_categories = 2,
  max_cells = 20
)

extrapolation_list <- FishStatsUtils::make_extrapolation_info(
  Region = "user",
  input_grid = cbind(Lat = Lat, Lon = Lon, Area_km2 = 1),
  Save_results = FALSE
)

spatial_list <- FishStatsUtils::make_spatial_info(
  n_x = settings$n_x,
  Lon_i = DF$Lon,
  Lat_i = DF$Lat,
  Extrapolation_List = extrapolation_list,
  Method = "Mesh",
  Save_Results = FALSE,
  fine_scale = TRUE,
  nstart = 1
)

data_list <- VAST::make_data(
  b_i = DF$b,
  a_i = rep(1, nrow(DF)),
  t_i = rep(2022, nrow(DF)),
  c_iz = DF$c_iz,
  e_i = c(rep(0, 25), rep(1, 125)), # Does not crash if: e_i = DF$c_iz
  FieldConfig = settings$FieldConfig,
  OverdispersionConfig = settings$OverdispersionConfig,
  RhoConfig = settings$RhoConfig,
  VamConfig = settings$VamConfig,
  ObsModel_ez = c(1, 0),
  spatial_list = spatial_list,
  Options = settings$Options,
  Aniso = TRUE,
  CheckForErrors = TRUE,
  Version = "VAST_v14_0_1"
)

tmb_list <- VAST::make_model(
  TmbData = data_list,
  Method = "Mesh",
  RhoConfig = settings$RhoConfig,
  Version = "VAST_v14_0_1",
  loc_x = spatial_list$loc_x
)
James-Thorson-NOAA commented 1 year ago

Thanks for your interest! I think having the ability to flexibly assign error distributions to different observations is one advantage to using a more flexible interface than is available in glm/gam frameworks.

please replace e_i = c(rep(0, 25), rep(1, 175)) and it should work. I'll add an informative error message checking the length of e_i. I'm closing for now but please report back with any further issue