inlabru-org / inlabru

inlabru
https://inlabru-org.github.io/inlabru/
73 stars 21 forks source link

implements new `probs` and `return.samples` options to `predict.bru` #143

Closed pierreroudier closed 2 years ago

pierreroudier commented 2 years ago

This PR implements two simple options for predict.bru:

Examples:

# Required for reproducible predict() and generate() output.
set.seed(1234L)

input.df <- data.frame(x = cos(1:10), zz = rep(c(1, 10), each = 5))
input.df <- within(input.df, y <- 5 + 2 * cos(1:10) +
  rnorm(10, mean = 0, sd = 1)[zz] +
  rnorm(10, mean = 0, sd = 0.1))

# Fit a model with fixed effect 'x' and intercept 'Intercept'
fit <- bru(y ~ x + z(zz, model = "iid", mapper = bru_mapper_index(10)), family = "gaussian", data = input.df)

For the probs option:

x1 <- predict(
  fit,
  data = NULL,
  formula = ~ x_latent + fit$summary.random$z$mean[1],
  n.samples = 50, 
  probs = seq(0, 1, by = 0.2), 
  seed = 12345L
)

Which gives:

> x1
      mean        sd       q0    q0.2     q0.4     q0.6     q0.8       q1     smin     smax
1 2.000632 0.0831341 1.843082 1.93589 1.978886 2.007566 2.052245 2.220725 1.843082 2.220725
          cv         var
1 0.04155393 0.006911278

Note that for a probability of 0.5, the column name is automatically changed to median (current behaviour):

x1.2 <- predict(
  fit,
  data = NULL,
  formula = ~ x_latent + fit$summary.random$z$mean[1],
  n.samples = 50, 
  probs = c(0.05, 0.5, 0.90), 
  seed = 12345L
)

Which gives:

> x1.2
      mean         sd    q0.05   median     q0.9     smin     smax         cv        var
1 1.998391 0.07135222 1.894265 1.996368 2.092928 1.843082 2.151538 0.03570485 0.00509114

The default behaviour is the same as the current behaviour:

x1.3 <- predict(
  fit,
  data = NULL,
  formula = ~ x_latent + fit$summary.random$z$mean[1],
  n.samples = 50, 
  seed = 12345L
)

Which gives:

> x1.3
      mean         sd   q0.025   median   q0.975     smin     smax         cv         var
1 1.998759 0.06906419 1.895441 1.988629 2.140935 1.873755 2.194334 0.03455353 0.004769863

For the return.samples option:

x2 <- predict(
  fit,
  data = NULL,
  formula = ~ x_latent + fit$summary.random$z$mean[1],
  n.samples = 50, 
  return.samples = TRUE, 
  seed = 12345L
)

Which gives:

> x2
  sample.1 sample.2 sample.3 sample.4 sample.5 sample.6 sample.7 sample.8 sample.9 sample.10
1 2.044666 2.004647  1.88739 2.040335 1.956185 2.056624 2.022956 1.991965 2.098645  2.044206
  sample.11 sample.12 sample.13 sample.14 sample.15 sample.16 sample.17 sample.18 sample.19
1  1.953144  2.014561  1.990395  1.912923  2.088756  1.888543    2.0334  2.125807  1.941713
  sample.20 sample.21 sample.22 sample.23 sample.24 sample.25 sample.26 sample.27 sample.28
1  2.172584  1.999395  1.865559  1.958225  2.000996   1.91859  1.891961  2.120751  2.154372
  sample.29 sample.30 sample.31 sample.32 sample.33 sample.34 sample.35 sample.36 sample.37
1  1.988764  1.921919  2.286053  1.897926  1.962047  1.983044  1.868498  1.979393  2.076378
  sample.38 sample.39 sample.40 sample.41 sample.42 sample.43 sample.44 sample.45 sample.46
1  2.020633  1.873755  1.974662  1.924925  1.969282  1.969128  1.959379  2.013215  1.948823
  sample.47 sample.48 sample.49 sample.50
1  2.045175  2.064974  2.057049  1.920725
ASeatonSpatial commented 2 years ago

I like this probs option, that is handy and saves me writing code to do it myself each time. However, is there a need for the return.samples option when generate() will return this with almost exactly the same arguments as the predict() call? Or is there some significant difference between what this returns and what generate() does?

finnlindgren commented 2 years ago

Thanks @pierreroudier ! The probs part is a good addition; I think I'll take this opportunity to make a non-backwards-compatible change and make 0.5 return q0.5 instead of median, to make the output more machine-readable (and also more like regular inla summary output), and remove some of the legacy outputs (I'll add an option to reproduce the old behaviour; we'll also re-add an old feature that provided the Monte Carlo uncertainties).

The samples aspects seems to just replicated the generate.bru output, and would be better left out of predict.bru.

The predict code can probably be cleaned up/reorganized (but that would be better done after merging the probs feature, as it's unrelated). It should ideally only need to do

... <- generate(...)
... <- bru_summarise(...)

but mostly for historical reasons it's a bit more messy than that; the main issue has been to create spatial object outputs. I've also planned to add a broom::tidy-like option.

I'll take a closer look at the PR and let you know and/or modify/merge.

pierreroudier commented 2 years ago

Thanks for the feedback @ASeatonSpatial and @finnlindgren !

You are absolutely right, I got carried away when implementing the probs option, and saw that (unused) cbind.only option in bru_summarise, and just went for it! Latest commit https://github.com/inlabru-org/inlabru/pull/143/commits/cf158f6378d8193229671069cc44472f25a857e2 removes return.samples, but keeps probs.

pierreroudier commented 2 years ago

I think I'll take this opportunity to make a non-backwards-compatible change and make 0.5 return q0.5 instead of median, to make the output more machine-readable (and also more like regular inla summary output)

Good idea I think, you'll see this is just a if statement to remove/alter: https://github.com/pierreroudier/inlabru/blob/cf158f6378d8193229671069cc44472f25a857e2/R/bru.inference.R#L1436-L1438

finnlindgren commented 2 years ago

I noticed from the checking action runs that the documentation needs some small fixes; note that we use the markdown-enabled roxygen style, so can/should use ... instead of \code{...}, and also need to "protect" [0,1], with [0,1] to prevent it from making it a link.

pierreroudier commented 2 years ago

I noticed from the checking action runs that the documentation needs some small fixes; note that we use the markdown-enabled roxygen style, so can/should use ... instead of \code{...}, and also need to "protect" [0,1], with [0,1] to prevent it from making it a link.

My bad -- corrected in https://github.com/inlabru-org/inlabru/pull/143/commits/c8405e861db9d2dd6bdfb1695f44fd885939b047