vast-lib / tinyVAST

Expressive interface for multivariate spatio-temporal models
https://vast-lib.github.io/tinyVAST/
GNU General Public License v3.0
8 stars 2 forks source link

`s( Double, by=Factor, id=1)` #50

Open James-Thorson-NOAA opened 2 weeks ago

James-Thorson-NOAA commented 2 weeks ago

@seananderson Have you explored using s(..., id) to specify that smoothing terms are shared across factor smooths?

The concept arises in long-form implementation of covariate smooths in a multivariate model, where you might not wanna estimate the log-penalty separately for each species. I'm curious if you have thought about how to get that properly parsed in the sdmTMB linkage to mgcv

seananderson commented 2 weeks ago

Short answer is no.

It also appears that gamm4 errors out when using id, which isn't promising, since it uses the same approach and presumably Simon Wood would have included it if it were straightforward.

I suppose we could map the penalties to match, although from reading the mgcv docs in ?gam.models, id also ensures the smooths use the same basis functions... which I suppose could also be hacked on our end. Experiments with mapping below:

library(sdmTMB)
library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.

set.seed(10)
dat <- gamSim(4)
#> Factor `by' variable example
b2 <- gamm4::gamm4(y ~ fac+s(x2,by=fac)+s(x0),data=dat, REML = FALSE)
b3 <- sdmTMB(y ~ fac+s(x2,by=fac)+s(x0),data=dat, spatial = "off")
logLik(b2$mer)
#> 'log Lik.' -837.6935 (df=12)
logLik(b3)
#> 'log Lik.' -837.6935 (df=12)

b2 <- gamm4::gamm4(y ~ fac+s(x2,by=fac, id=1)+s(x0),data=dat, REML = FALSE)
#> Error in gamm4.setup(gp, pterms = pTerms, data = mf, knots = knots): gamm can not handle linked smoothing parameters (probably from use of `id' or adaptive smooths)
# silently fits the same as before:
b3 <- sdmTMB(y ~ fac+s(x2,by=fac, id=1)+s(x0),data=dat, spatial = "off")
# logLik(b2$mer)
logLik(b3)
#> 'log Lik.' -837.6935 (df=12)

b4 <- gam(y ~ fac+s(x2,by=fac)+s(x0),data=dat)
ggeffects::ggpredict(b4, c("x2 [all]", "fac")) |> plot()


# can we 'fake' it?
b5 <- sdmTMB(y ~ fac+s(x2,by=fac, id=1)+s(x0),data=dat, spatial = "off",
  control = sdmTMBcontrol(map = list(ln_smooth_sigma = factor(rep(1, 4)))))
#> ℹ Fixing or mirroring `ln_smooth_sigma`
logLik(b5)
#> 'log Lik.' -857.5791 (df=9)
b5
#> Model fit by ML ['sdmTMB']
#> Formula: y ~ fac + s(x2, by = fac, id = 1) + s(x0)
#> Mesh: NULL (isotropic covariance)
#> Data: dat
#> Family: gaussian(link = 'identity')
#>  
#>             coef.est coef.se
#> (Intercept)     1.20    0.17
#> fac2           -1.75    0.24
#> fac3            2.27    0.23
#> sx2):fac1       1.78    2.14
#> sx2):fac2      -4.33    1.92
#> sx2):fac3      -7.77    2.00
#> sx0             2.45    1.54
#> 
#> Smooth terms:
#>               Std. Dev.
#> sds(x2):fac1)     10.43
#> sds(x2):fac2)     10.43
#> sds(x2):fac3)     10.43
#> sds(x0)           10.43
#> 
#> Dispersion parameter: 1.87
#> ML criterion at convergence: 857.579
#> 
#> See ?tidy.sdmTMB to extract these values as a data frame.

ggeffects::ggpredict(b3, c("x2 [all]", "fac")) |> plot()

ggeffects::ggpredict(b5, c("x2 [all]", "fac")) |> plot()


# but I guess these have their own basis functions still...
# could fix knot locations at least?

b6 <- sdmTMB(y ~ fac+s(x2,by=fac, id=1)+s(x0),data=dat, spatial = "off",
  control = sdmTMBcontrol(map = list(ln_smooth_sigma = factor(rep(1, 4)))),
  knots = list(x2 = seq(min(dat$x2), max(dat$x2), length.out = 10)))
#> ℹ Fixing or mirroring `ln_smooth_sigma`
logLik(b6)
#> 'log Lik.' -861.555 (df=9)
ggeffects::ggpredict(b6, c("x2 [all]", "fac")) |> plot()

Created on 2024-06-13 with reprex v2.1.0