paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0
1.28k stars 187 forks source link

Functional regression in brms #346

Closed ecologics closed 6 years ago

ecologics commented 6 years ago

Function-valued regression where either the response or one of the predictor variables is a function has a variety of applications. Classic examples are fMRI data as a functional response, and characteristics of individuals are used as scalar predictors. There are several recent and exciting applications in evolutionary biology and ecology, suggesting that we may soon see some growth in these applications. See for example: http://onlinelibrary.wiley.com/doi/10.1111/jse.12252/full and http://onlinelibrary.wiley.com/doi/10.1111/ecog.02306/abstract

The FREE package for R (https://github.com/jdyen/FREE) implements both scalar-on-function and function-on-scalar forms using a purpose-built Gibbs sampler. Indeed, it seems to run extremely quickly. An earlier version of the software is also available through the repo (devtools::install_github("jdyen/FREE", ref = "original")) and although the author says it is unsupported, it apparently produces and runs Stan code for the function-on-scalar form. It is discussed in this paper: http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12290/abstract

I have tried to install the legacy version and am unable to make it run the Stan code (likely because it is not compatible with current versions of Stan). But it is probably possible to get it to generate the code without calling Stan and have it return the model.

If brms could incorporate this function-valued approach it may indeed be slower than the current purpose-built Gibbs sampler, but it would offer numerous other advantages (i.e. potentially incorporation of other model variants that brms is capable of).

paul-buerkner commented 6 years ago

Thanks! Upon further inspection, I believe that function-on-scalar regression (functional response), is already possible with brms by writing the data in long format. Consider a slightly simplified version of the example in the paper you referenced:

n <- 100
m <- -10:10
a <- sin(m / 4 + 8 / 5)
x <- rnorm(n, 0, 1)
b <- sin(m + pi / 3)
y <- matrix(a, nrow = n, ncol = length(m), byrow = TRUE) + 
  outer(x, b, FUN = "*") + rnorm(n, sd = 2)
dat <- data.frame(y = as.vector(y), x = x, m = rep(m, each = n))

library(brms)
theme_set(theme_default())
fit <- brm(y ~ s(m) + s(m, by = x), dat, chains = 2, cores = 2)
# will only work with brms >= 2.1.4
marginal_smooths(fit, int_conditions = list(x = 1))

This won't fit as fast as the specifically built Gibbs-sampler of course, but as you say can be combined with everything else brms offers.

Function-on-function regression is possible in the same way, but we will model m and x via a two-dimensional spline in this case: y ~ s(m, x)

With regard to scalar-on-function regression, things are a bit different, because we need to numerically integrate over the spline. Stan 2.18 will add a one-dimensional numerical integrator, which we might be able to use. Also, there seems to be a built in solution in mgcv (see ?mgcv::linear.functional.terms), but this may not yet work with brms, because s() needs to take matrices as input in this case, which may cause problems with brms' "data-is-required" approach. Also, the mgcv solution seems to be more involved from a user perspective, so not sure if this is really an approach I want to support.

ecologics commented 6 years ago

Very good! Yes, this works exactly as expected and it really fits quite speedily. Okay then! brms does function-on-scalar regression and function-on-function regression. Check.

The newer version of marginal_smooths() also works nicely and flexibly to produce the requisite plots..

Just to be clear if we had multiple scalar predictors in a function-on-scalar regression, would the approach be?

y ~ s(m) + s(m, by=x1) + s(m, by=x2) + ...

Scalar-on-function regression would be a fabulous future feature when or if numerical integration gets implemented in Stan.

Thanks Paul for all your efforts on this fabulous package.

paul-buerkner commented 6 years ago

y ~ s(m) + s(m, by=x1) + s(m, by=x2) + ...

yes this would be the way to go for multiple predictors.

paul-buerkner commented 6 years ago

I have to correct myself: Function-on-function regression is not simply implemented via y ~ s(m, x), There may be some built-in functionality within mgcv that I am not aware of (so there may already be an easier solution), but the following should work using brms' non-linear syntax, which can also be used to express function-on-scalar regression,

bf(y ~ a + b * x, a ~ s(m), b ~ s(m), nl = TRUE)
ecologics commented 6 years ago

Simon Wood in the second edition of Generalized Additive Models (2017) demonstrates scalar-on-function and function-on-scalar regression using mgcv. This is a handy reference. pp. 390-397. Is there any reason--in principle--that scalar-on-function should not also work in brms.

Wood uses the gas data here to demonstrate scalar-on-function in the following where octane is a vector and NIR is a matrix:

b <- gam(octane ~ s(nm, by=NIR, k=50), data=gas)

To work in brms presumably the matrix NIR would need to be unwound as in the above examples.

paul-buerkner commented 6 years ago

Thanks! There is no reason in principle just the "problem" of handling a variable (the NIR matrix in this case) that is not diretly part of the data. Of course, we could write it using cbind(x1, x2, ....) but this will be a long call I guess... Of course, I could just allow variables outside the data, but the problem would come back again when doing plotting and predictions. There should be a more principled approach to this.

Does the NIR matrix come with the gas data or where does he get it? That is, could you give me a reproducible example for scalar-on-function regression to play with myself?

ecologics commented 6 years ago

Here is a reproducible example from Simon Wood (2017) Generalized Additive Models, 2nd ed., p. 391, demonstrating scalar-on-function regression (scalar response regression) using the mgcv package.

library(mgcv)
install.packages("gamair") ## CRAN package associated with above book
library(gamair)
data(gas)

b <- gam(octane ~ s(nm, by=NIR, k=50), data=gas)

plot(b, scheme=1, col=1)

The gas object is as follows:


 $ octane: num  85.3 85.2 88.5 83.4 87.9 ...
 $ NIR   : AsIs [1:60, 1:401] -0.0502 -0.0442 -0.0469 -0.0467 -0.0509 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr  "1" "2" "3" "4" ...
  .. ..$ : chr  "900 nm" "902 nm" "904 nm" "906 nm" ...
 $ nm    : num [1:60, 1:401] 900 900 900 900 900 900 900 900 900 900 ...```
paul-buerkner commented 6 years ago

Thanks! I must admit, quite embarrisingly, that scalar-on-function regression seems to be already working with brms in the same way as in mgcv. I just wasn't aware of how data.frames can handle matrices as columns and that brms actually supports this sort of columns...

For your example, we can write

b2 <- brm(octane ~ s(nm, by=NIR, k=50), data=gas)

but I see some convergence warnings. The only post-processing method that needs some tweaks to support matrix columns seems to be marginal_effects. I will check it out more thorougly next week.

paul-buerkner commented 6 years ago

The post-processing does also work fine I think. For the above b2 model, you can run

marginal_smooths(b2, int_conditions = list(NIR = 1))

to see the estimate smooth function. I continue to be impressed by the functionality mgcv offers. Since every functional regression type turned out to be already working, are you ok with closing this issue?

ecologics commented 6 years ago

Indeed, mgcv is remarkable. Making it fit with Stan might not be as straightforward, though. Here is the model summary after setting max_treedepth=15, which is not deep enough.

  Links: mu = identity; sigma = identity 
Formula: octane ~ s(nm, by = NIR, k = 50) 
   Data: gas (Number of observations: 60) 
Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; 
         total post-warmup samples = 4000
    ICs: LOO = NA; WAIC = NA; R2 = NA

Smooth Terms: 
              Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sds(snmNIR_1)    50.20     12.38    29.93    78.23        125 1.03

Population-Level Effects: 
          Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
Intercept    87.60     13.15    62.23   112.37        457 1.00
snm:NIR_1     3.05      4.33    -5.30    11.72        103 1.01
snm:NIR_2    -3.70      5.26   -14.10     6.55        103 1.01

Family Specific Parameters: 
      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
sigma     0.19      0.02     0.16     0.24        379 1.01

How do we interpret the snm:NIR_1 and snm:NIR_2 parameters here?

paul-buerkner commented 6 years ago

Yes, this max_treedepth think is strange, but I don't think it stops the model from converging here to be honest. For instance, pp_check shows reasonable plots and bayes_R2 is at 97% or so. Also the marginal_smooth plot looks reasonable. So I think the model did converge. You might want to ask at the Stan forums (http://discourse.mc-stan.org/) if they have any idea, why every post-warmup sample exeedes the maximum treedepth.

In splines, the parameters are rarely interpretable at all and I assume it is the same with the present spline. snm:NIR_1 and snm:NIR_2 represent the linear components of the spline, which I expect to be rather not interpretable since we also have 50 non-linear coefficients, which all together form the spine, we see in the plots.

ecologics commented 6 years ago

I figured these parameters were not much use, as per splines in general.

Great Paul! Functional regression is functional in brms. I'll ask that question on the Stan forums. Shall I keep this issue open until there is some more resolution regarding the high number of treedepth warnings?

paul-buerkner commented 6 years ago

Great! I think you can savely close this issue now, as their is likely nothing to be changed from the brms side anyway to get rid of the convergence warnings. After all I am just passing matrices prepared by mgcv to Stan.