Open fweber144 opened 3 years ago
There's another issue with as.matrix.projection() for GAMs visible here (the smooth_sd parameters) which I'll open a new issue for.
This is issue #151 now.
This has to do with the way we split the original fit object's formula to do the forward search. We defined some conventions that seemed reasonable, but they may give rise to issues like this. In this case, for instance, it was decided that we also want to project the main effect of the smoothing term, which serves as a way of asking "do I need the smoothing term or is the linear term enough in terms of predictive accuracy?".
In this case, for instance, it was decided that we also want to project the main effect of the smoothing term, which serves as a way of asking "do I need the smoothing term or is the linear term enough in terms of predictive accuracy?".
That definitely makes sense, but I think we need to differentiate between the (forward) search and the projection onto a pre-specified submodel. My concern here is the latter.
In any case, a submodel shouldn't contain both the nonsmoothed term (e.g. mom_iq) and the smoothed term (e.g. s(mom_iq)), right? For example, if you fit a model with both of these terms in rstanarm, you'll get a lot of convergence issues. So including both terms leads to a degenerate model. Therefore, I think projpred should catch that case (when projecting onto a pre-specified submodel) and either
EDIT: An earlier version referred to mgcv, where fitting a model with a nonsmoothed as well as a smoothed term would give 0 reference degrees of freedom for the smoothed term and an F-statistic as well as a p-value which are NaN. However, that doesn't seem to be necessarily the case, see e.g.:
library(mgcv)
set.seed(2)
dat <- gamSim(1, n = 40)
b <- gam(y ~ x2 + s(x2), data = dat)
summary(b)
When debugging the projection of the first draw in the reprex above, however, we do get 0 reference degrees of freedom for the smoothed term and an F-statistic as well as a p-value which are NaN.
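For reference, the quantities mentioned here can be read off programmatically from summary.gam()'s return value (its s.table component; shown for the reprex above, assuming mgcv is available):

```r
library(mgcv)

set.seed(2)
dat <- gamSim(1, n = 40)
b <- gam(y ~ x2 + s(x2), data = dat)
# summary.gam() returns the smooth-term table as s.table, with columns
# edf, Ref.df, F (or Chi.sq) and p-value; a Ref.df of 0 together with NaN
# F and p-value would indicate the degenerate case discussed above.
st <- summary(b)$s.table
st[, "Ref.df"]  # nonzero here, i.e. not degenerate in this particular example
```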
As far as I remember, if the smooth is included first, the linear term can't be added, but if the linear term is added first, the smooth can be included later. So a model with both could exist, and from the modeling perspective it could make sense in the right setting, but I agree they have to be properly constrained and so on. Probably @
The models I have tested with rstanarm and mgcv had the linear term included first, so that order didn't help.
Interestingly, if the linear term appears earlier and the smooth appears in solution terms that means that the projection actually performs better with both. We may need to add diagnostics that catch these potential issues in the projections.
Sorry, I don't understand what you mean by
the linear term appears earlier and the smooth appears in solution terms
Could you explain that?
Yes.
The way we coded the forward search implies that the only case where both the smooth and the linear term would appear in the solution is if the linear term is chosen first, indicating that it is a good predictor even without the smooth. It is then possible that the smooth adds extra predictive power and is therefore chosen as part of the solution as well. I will check if this is indeed the case, that is, if including the smooth first actually removes the linear term. If so, then the smooth may or may not be part of the optimal subset of predictors, and in that case it is fully possible for a projection including both terms to be unstable and have convergence issues; that is something we could add diagnostics for.
Yes, that's the search behavior I would have expected, although I think the linear term needs to be removed whenever the smoothed term is included, not only
if including the smooth first actually removes the linear term
Anyway, as I said above, I'm only referring to project() here in this issue, not to the search. I don't know if there are similar issues with the search, but I think it's already within project() that projpred should catch the case where the user supplies both (the linear as well as the smoothed term) to solution_terms.
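As a concrete illustration, a pre-projection check along these lines could flag the degenerate input. This is a hypothetical sketch in base R; the helper name and the assumption that solution_terms arrives as a plain character vector are mine, not projpred's actual internals (which would also have to handle s() arguments such as k or bs):

```r
# Hypothetical sketch: detect when a solution_terms vector contains both a
# linear term and its smoothed counterpart, e.g. "mom_iq" and "s(mom_iq)".
check_solution_terms <- function(solution_terms) {
  # Strip the s(...) wrapper to recover the underlying predictor name:
  base_terms <- sub("^s\\((.*)\\)$", "\\1", solution_terms)
  dup <- unique(base_terms[duplicated(base_terms)])
  if (length(dup) > 0) {
    warning("Linear and smoothed versions of the same predictor(s) supplied: ",
            paste(dup, collapse = ", "))
  }
  dup
}

check_solution_terms(c("mom_iq", "s(mom_iq)"))  # flags "mom_iq"
check_solution_terms(c("mom_iq", "s(mom_age)")) # nothing to flag
```

Whether such input should then trigger an error, a warning, or an automatic drop of the linear term is exactly the design question raised above.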
In any case, a submodel shouldn't contain both, the nonsmoothed term (e.g. mom_iq) as well as the smoothed term (e.g. s(mom_iq)), right?
It depends on whether the s() term includes a linear term inside. There may have been confusion, as I had thought s() would not contain the linear term, but based on this comment it does. I've also checked this at some point, but I can't remember the details, and the mgcv documentation doesn't say it clearly.
Yes, that's true. The mgcv documentation is a bit scarce with regard to the mathematical details of the smoothing terms (e.g., s()). However, ?mgcv::gam.models and the beginning of chapter 18 of "The R Book" by Michael Crawley make me think that an additional linear term is not necessary when using s() (so the typical usage of s() would be s(x) and not s(x) + x).
The example at the beginning of section 5.2 of Wood (2006), together with the notation introduced earlier in the book, makes it clear that an additional linear term is not needed.
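For what it's worth, this can also be checked numerically: with mgcv's default thin plate basis, the linear term lies in the (unpenalized) null space of the smooth, so the column space of the intercept plus the constrained s(x) basis already contains x, and an explicit x column adds nothing. A small sketch (assuming mgcv is available; this is my own check, not part of the discussion above):

```r
library(mgcv)

set.seed(1)
x <- runif(50)
# Construct the design matrix of s(x) with the centering constraint absorbed:
sm <- smoothCon(s(x), data = data.frame(x = x), absorb.cons = TRUE)[[1]]
X <- sm$X
# If adding an explicit linear column does not increase the rank, the linear
# term is already contained in the span of intercept + smooth basis:
rank_without <- qr(cbind(1, X))$rank
rank_with <- qr(cbind(1, x, X))$rank
rank_without == rank_with  # should be TRUE for the default tp basis
```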
References:
Wood, S. N. (2006). Generalized additive models: An introduction with R (1st ed.). Chapman & Hall/CRC.
On branch develop (commit 96d8e11d6f0ad0a1f7836c51700cb676b364bee5), the first reprex gives an output matrix that includes a linear term for the smoothed predictor, whereas the second reprex (correctly) does not.
Perhaps the problem is not really with as.matrix.projection(), but rather with project(), which would need to project onto the submodel with the smoothed term only (and not an additional linear term), even if solution_terms = c("mom_iq", "s(mom_iq)")?