pbs-assess / sdmTMB

:earth_americas: An R package for spatial and spatiotemporal GLMMs with TMB
https://pbs-assess.github.io/sdmTMB/
184 stars 26 forks source link

`predict()` makes R session crash in dev branch when passing `newdata` #100

Closed kellijohnson-NOAA closed 2 years ago

kellijohnson-NOAA commented 2 years ago

Something changed between 066338b2ada5fb748455e317375adcbcdefe4ccb and c57ac95 in the dev branch that is leading to termination of my R session when I run predict(fit, newdata = ). It will terminate if I run the previous after fitting the model or if I use predict_args inside of sdmTMB::sdmTMB().

seananderson commented 2 years ago

Is it possible this is with a model saved from an older version? I think somewhere in there Eric added an extra TMB data element (stan_flag).

This currently runs for me with the latest 'dev':

library(sdmTMB)
mesh <- make_mesh(pcod_2011, c("X", "Y"), cutoff = 20)
fit <- sdmTMB(
  density ~ s(depth),
  data = pcod_2011, mesh = mesh,
  family = tweedie(link = "log")
)
p <- predict(fit)
head(p)
#> # A tibble: 6 × 17
#>    year     X     Y depth density present   lat   lon depth_mean depth_sd
#>   <int> <dbl> <dbl> <dbl>   <dbl>   <dbl> <dbl> <dbl>      <dbl>    <dbl>
#> 1  2011  435. 5718.   241    245.       1  51.6 -130.       5.16    0.445
#> 2  2011  487. 5719.    52      0        0  51.6 -129.       5.16    0.445
#> 3  2011  490. 5717.    47      0        0  51.6 -129.       5.16    0.445
#> 4  2011  545. 5717.   157      0        0  51.6 -128.       5.16    0.445
#> 5  2011  404. 5720.   398      0        0  51.6 -130.       5.16    0.445
#> 6  2011  420. 5721.   486      0        0  51.6 -130.       5.16    0.445
#> # … with 7 more variables: depth_scaled <dbl>, depth_scaled2 <dbl>,
#> #   `_sdmTMB_time` <int>, est <dbl>, est_non_rf <dbl>, est_rf <dbl>,
#> #   omega_s <dbl>
p <- predict(fit, newdata = qcs_grid)
head(p)
#>     X    Y    depth depth_scaled depth_scaled2 year _sdmTMB_time       est
#> 1 456 5636 347.0834    1.5608122    2.43613479 2003            0 -4.726672
#> 2 458 5636 223.3348    0.5697699    0.32463771 2003            0  2.342465
#> 3 460 5636 203.7408    0.3633693    0.13203724 2003            0  3.087509
#> 4 462 5636 183.2987    0.1257046    0.01580166 2003            0  3.878558
#> 5 464 5636 182.9998    0.1220368    0.01489297 2003            0  4.020912
#> 6 466 5636 186.3892    0.1632882    0.02666303 2003            0  4.050893
#>   est_non_rf      est_rf     omega_s
#> 1  -4.567419 -0.15925278 -0.15925278
#> 2   2.368309 -0.02584346 -0.02584346
#> 3   2.979943  0.10756586  0.10756586
#> 4   3.637582  0.24097517  0.24097517
#> 5   3.646527  0.37438449  0.37438449
#> 6   3.543100  0.50779380  0.50779380

Created on 2022-06-01 by the reprex package (v2.0.1)

seananderson commented 2 years ago

If that's the case, and you don't want to re-run things, you should just be able to add the extra bit of data:

fit$tmb_data$stan_flag <- 0L

There's usually a check for that in the main branch (incremented minor version number checked), but that hasn't been happening in the dev branch.

kellijohnson-NOAA commented 2 years ago

No, I was fitting the model when it crashed.

seananderson commented 2 years ago

Another major thing that happened was TMB was updated to 1.9.0 a few days ago on CRAN. It's possible you have a mismatch with sdmTMB built with an older version. Can you install the latest TMB and update sdmTMB dev? Also TMB 1.9.0 broke sdmTMB (and glmmTMB, ...) until I fixed something with how openmp() was called yesterday.

kellijohnson-NOAA commented 2 years ago

I am using TMB 1.9.0. But I installed sdmTMB and glmmTMB after installing TMB and prior to running anything. I can try again though.

seananderson commented 2 years ago

Then other than restarting the R session to make sure everything is loaded fresh, can you try running the example I posted above to see if that runs for you?

seananderson commented 2 years ago

Actually, if it's all the way back at https://github.com/pbs-assess/sdmTMB/commit/066338b2ada5fb748455e317375adcbcdefe4ccb and the above pcod example runs for you, then the most obvious culprit is the unique coordinate prediction, which was implemented slightly after https://github.com/pbs-assess/sdmTMB/commit/066338b2ada5fb748455e317375adcbcdefe4ccb

Perhaps there's something about the grid you're predicting on that is causing the unique locations to end up indexing out of bounds.

So, if it works with the pcod example, can you point me to an example to run?

kellijohnson-NOAA commented 2 years ago

I reinstalled the current dev branch and the example crashes the console at p <- predict(fit, newdata = qcs_grid).

seananderson commented 2 years ago

Very strange. Can you confirm that

remotes::install_github("pbs-assess/sdmTMB", ref = "066338b")

restarting R if necessary, and running the above example does work?

seananderson commented 2 years ago

Additional question: I assume you're on R 4.2.0? Maybe post a sessionInfo()?

kellijohnson-NOAA commented 2 years ago
> library(sdmTMB)
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sdmTMB_0.0.24.9009

loaded via a namespace (and not attached):
 [1] compiler_4.1.2   Matrix_1.4-0     assertthat_0.2.1 generics_0.1.2  
 [5] cli_3.3.0        mgcv_1.8-39      splines_4.1.2    nlme_3.1-155    
 [9] grid_4.1.2       lifecycle_1.0.1  rlang_1.0.2      lattice_0.20-45 
seananderson commented 2 years ago

It works for me with the latest packages on R 4.1.2 on a Mac, so it's not just the R version...

OK, I've confirmed the crash on Windows R 4.1.2 and TMB 1.9.0.

seananderson commented 2 years ago

I can't figure out what's going on and I'm out of time right now, so I added a check in predict to use the old less efficient version if on Windows: https://github.com/pbs-assess/sdmTMB/commit/0f43cfcaf5e6d62b46a62b2a29ada8267cc5095e

This should avoid the crash for now, but it will use more time on the MakeADFun() calls because of all the duplicated random effects. (The more efficient code finds unique spatial locations and indexes those.)

kellijohnson-NOAA commented 2 years ago

@seananderson I confirmed that your fix works on my machine this morning, thank you. Let me know if there is any additional testing that I can do to help.

seananderson commented 2 years ago

There was a bug/typo where I was setting up some prediction arrays to be larger than they had to be and apparently that is fine with some compilers but crashes with others. Should be fixed now and all platforms can use the more efficient unique-prediction-grid approach. https://github.com/pbs-assess/sdmTMB/commit/d300f1f61aa7048a95fa3251c98227dddd7b84de