gavinsimpson / gratia

ggplot-based graphics and useful functions for GAMs fitted using the mgcv package
https://gavinsimpson.github.io/gratia/
Other
202 stars 28 forks source link

Encountering errors with difference_smooths() - potential bug? #285

Closed AHugh8 closed 1 month ago

AHugh8 commented 5 months ago

I am running the following model:

modGI <- gam(Met1 ~ s(Timepoint, k=7) + s(Timepoint, by=subject, k=7, m=1) + s(subject, bs="re"), data=df, method="REML")

smooths(modGI)

"s(Timepoint)" "s(Timepoint):subject1" "s(Timepoint):subject3" "s(Timepoint):subject4" "s(Timepoint):subject5" "s(Timepoint):subject6" "s(Timepoint):subject7" "s(Timepoint):subject8" "s(subject)"

I am coming across some errors, but unsure why they're appearing.

1) diff_sms <- difference_smooths(modGI, smooth="s(Timepoint)")

Error in combn(levels(data[[by_var]]), 2) : n < m

2) attempting to compare the smooths of certain subjects e.g.,

check1 <- difference_smooths(modGI, smooth= c("s(Timepoint):subject1", "s(Timepoint):subject2"), partial_match=FALSE)

Error in pmap():ℹ In index: 9. Caused by error in Xp[r1, ] - Xp[r2, ]: ! non-conformable arrays

A snippet of my data (df) is as follows:

structure(list(subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), levels = c("1", "3", "4", "5", "6", "7", "8", "10", "11", "12", "16", "17", "19", "21", "22", "25", "26", "28", "35", "36", "37", "38", "39", "40", "42", "43", "44", "45", "46", "48", "50", "51", "52", "55", "56", "57", "58", "61", "65", "71"), class = "factor"), Timepoint = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L ), Met1 = c(0.012461309, 0.01794298, 0.003863203, -0.006398583, 0.001760362, 0.000870293, 0.000830637, -0.014258656, -0.01946732, 0.00389311, 0.000199752, -0.0122629, 0.007792022, 0.007019363, 0.006641794, 0.012690831, 0.004881705, -0.004663227, 0.002341043, -0.008827367, -0.009172269, 0.038058982, -0.002204742, -0.003488286, 0.004832194, -0.00137924, -0.016501518, 0.006870124, 0.011860637, 0.005993393, -0.004560137, -0.007139575, 0.000783146, 0.026054272, 0.000494855, 0.008204079, -0.011764013, -0.006389326, 0.008017244, 0.004591451, 0.012983199, 0.010973351, 0.005136924, 0.014092113, 0.002394753, 0.022453531, 0.012407455, 0.001519177, 0.000155242 )), row.names = c(NA, 49L), class = "data.frame")

Would be grateful if you could please take a look at this when you have the chance? Thank you

gavinsimpson commented 4 months ago

Thanks @AHugh8 I'm having trouble reproducing this with a similar local example:

# for issue #285
su_eg4 <- data_sim("eg4", n = 400, dist = "normal", scale = 2, seed = 1)
su_m_factor_by_re <- gam(y ~ s(fac, bs = "re") + s(x2) +
    s(x2, by = fac, m = 1) + s(x0),
  data = su_eg4, method = "REML"
)

su_m_factor_by_re |>
  difference_smooths(select = "s(x2)")

which works for me with the CRAN version of gratia (v 0.9.0)

Can you edit the issue to include the output from sessionInfo() so I can see what versions of packages you are running.

Alternatively, if you can send me your data privately (I will delete it as soon as I identify the problem) I will see if I can replicate the issue. Failing that, I'll need a reproducible example to make progress. And other than that, make sure you have updated all your packages and are running in a clean sesison.

gavinsimpson commented 1 month ago

I'm going to close this as "can't reproduce". It can be reopened if a reproducible example is forthcoming.