gavinsimpson / gratia

ggplot-based graphics and useful functions for GAMs fitted using the mgcv package
https://gavinsimpson.github.io/gratia/

`draw.gam()` errors with 2d smooths fit on matrixes (`plot.gam()` and `smooth_estimates()` work) #116

Closed Aariq closed 2 years ago

Aariq commented 2 years ago

This is maybe an uncommon way to parameterize GAMs, so this is probably an edge case. If the covariates passed to, say, te() are each matrixes, then draw.gam() errors even though other methods still work.

Reprex:

library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.8-36. For overview type 'help("mgcv-package")'.
library(gratia)
library(tsModel) # for Lag()
#> Time Series Modeling for Air Pollution and Health (0.6)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following object is masked from 'package:nlme':
#> 
#>     collapse
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(ggplot2)

df <- data_sim(1, n = 1000, dist = "normal", scale = 1, seed = 2)

# create matrix-columns for distributed-lag non-linear model:
df2 <- 
  df %>% 
  mutate(f_lag = Lag(f, 1:5),
         lag = matrix(1:5, ncol = 5)) %>% 
  filter(!is.na(f_lag[,5]))

# fit DLNM GAM
m3 <- gam(y ~ te(f_lag, lag), data = df2, method = "REML")

# draw errors
draw(m3)
#> Error: Aesthetics must be either length 1 or the same as the data (995): x and y

# mgcv::plot.gam() works
plot(m3)

# so does this:
smooth_estimates(m3) %>% 
  ggplot(aes(x = f_lag, y = lag, fill = est, z = est)) +
  geom_raster() +
  geom_contour()

Created on 2021-09-22 by the reprex package (v2.0.1)

gavinsimpson commented 2 years ago

Definitely not common but something I want to handle. If smooth_estimates() can generate the right output then it should be trivial to fix draw() to handle it correctly; I'm probably just making a wrong assumption somewhere about what the tensor smooth plotting code should handle...

Aariq commented 2 years ago

My guess is the problem comes about when trying to plot the raw data points. Maybe a data = argument to that geom_point() layer would fix it. Alternatively, you could just not allow plotting of data points when the model frame contains things other than vectors. mgcv has a few places where, if you try something that ordinarily works but the dimensions of te() are matrixes, it just gives an informative error.
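To illustrate the `data =` idea, here is a minimal sketch on made-up data (not gratia's internals). The names `grid` and `pts` are hypothetical: `grid` stands in for the evaluated smooth surface and `pts` for the observed covariate values, which have a different number of rows. Giving the point layer its own data frame means its aesthetics are evaluated against that data frame rather than the plot's default data.

```r
library(ggplot2)

# Toy evaluation grid for the surface: 10 x 10 = 100 rows (assumed values)
grid <- expand.grid(x = seq(0, 1, length.out = 10),
                    y = seq(0, 1, length.out = 10))
grid$est <- with(grid, sin(2 * pi * x) * y)

# Toy "observed" covariate values with a different row count (20 rows)
set.seed(1)
pts <- data.frame(x = runif(20), y = runif(20))

p <- ggplot(grid, aes(x = x, y = y)) +
  geom_raster(aes(fill = est)) +
  # this layer uses pts, not grid, so x and y are looked up in pts
  geom_point(data = pts, alpha = 0.5)
```

Printing `p` should draw the raster with the points overlaid. Whether this is the right fix for draw() depends on how the raw data are attached, which this sketch does not attempt to reproduce.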

gavinsimpson commented 2 years ago

That's 100% right: the processing that attaches the raw data points for the rug plots wasn't expecting matrices in the input data. I've fixed this code to flatten the elements of the input data, just as {mgcv} treats them. I should have a fix pushed today once I've finished running through all the unit tests to make sure this hasn't broken anything else.
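A rough sketch of what flattening could look like in base R follows. It is an illustration under stated assumptions, not the code that went into gratia: `flatten_covariates()` is a hypothetical helper, and the toy data frame only mimics the matrix-columns from the reprex. Each matrix-column is unrolled with `as.vector()` (column-major order), so every element becomes one row with a single value per covariate.

```r
# Toy data: 4 observations, each covariate stored as a 4 x 3 matrix-column
set.seed(42)
df <- data.frame(id = 1:4)
df$f_lag <- matrix(rnorm(12), nrow = 4, ncol = 3)
df$lag   <- matrix(1:3, nrow = 4, ncol = 3, byrow = TRUE)

# Hypothetical helper: unroll the named matrix-columns into plain vectors.
# Assumes all selected columns have the same dimensions.
flatten_covariates <- function(data, vars) {
  flat <- lapply(data[vars], as.vector)  # as.vector() drops the dim attribute
  as.data.frame(flat)
}

pts <- flatten_covariates(df, c("f_lag", "lag"))
dim(pts)  # 12 rows (4 observations x 3 lags), 2 columns
```

The flattened data frame has one row per matrix element, so it can be handed to a plotting layer without its length clashing with the number of rows in the model frame.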

Aariq commented 2 years ago

Thanks for the quick fix!