Open dfrankow opened 4 years ago
I'll attach my logs so far, why not. It's a bit opaque, tho.
Below are the log statements in diff form, and I attach two files:
- `1.log` from `get_effect_variances(mtcars, model1)`
- `2.log` from `get_effect_variances(mtcars, model2)`
The first obvious difference is at `i = 2`. `1.log` has `numer` = 9.99e-08 and `out[, i]` = 0.99 (the long line runs off the screen):
```
*** i 2 coeftemp 39.6862614802529 FUN -3.1909720389839 F0 -3.1909721389834 numer 9.99994953509997e-08 out[, i] 0.999994953509997
```
`2.log` has `numer` = 0 and `out[, i]` = 0:
```
*** i 2 coeftemp 39.686261480253 FUN -3.19097213898376 F0 -3.19097213898376 numer 0 out[, i] 0
```
It's as if FUN=F0 for some reason I can't quite see in 2.log.
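To make the symptom concrete, here is a minimal sketch (mine, not the actual `margins` internals; weights dropped for brevity) of the same forward-difference Jacobian as `R/jacobian.R`. If `FUN` effectively ignores the perturbed coefficients, for example because the perturbation never reaches the model, every numerator is exactly 0 and the whole Jacobian comes out zero, which is exactly what `2.log` shows:

```r
# Simplified forward-difference Jacobian, mirroring R/jacobian.R.
jacobian_fd <- function(FUN, coefficients, eps = 1e-7) {
  F0 <- FUN(coefficients)
  out <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
  for (i in seq_along(coefficients)) {
    coeftemp <- coefficients
    coeftemp[i] <- coeftemp[i] + eps
    out[, i] <- (FUN(coeftemp) - F0) / eps   # numer / eps, as in the logs
  }
  out
}

beta <- c(a = 2, b = -3)
fun_ok     <- function(b) c(sum(b), b["a"] * b["b"])          # responds to b
fun_broken <- function(b) c(sum(beta), beta["a"] * beta["b"]) # ignores b
jacobian_fd(fun_ok, beta)      # nonzero columns, like 1.log
jacobian_fd(fun_broken, beta)  # all zeros, like 2.log
```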
Log statements:
```diff
diff --git a/R/gradient_factory.R b/R/gradient_factory.R
index 4238150..c601512 100644
--- a/R/gradient_factory.R
+++ b/R/gradient_factory.R
@@ -33,6 +34,7 @@ gradient_factory.default <- function(data, model, variables = NULL, type = "resp
# apply colMeans to get average marginal effects
means <- apply(me_tmp, 2L, stats::weighted.mean, w = weights, na.rm = TRUE)
}
+ write(paste("** FUN me_tmp", me_tmp, "means", means), file="the.log", append=TRUE)
means
}
return(FUN)
diff --git a/R/jacobian.R b/R/jacobian.R
index 2ac79aa..1298cb9 100644
--- a/R/jacobian.R
+++ b/R/jacobian.R
@@ -1,5 +1,7 @@
# function returns a Jacobian matrix (a term-by-beta matrix)
jacobian <- function(FUN, coefficients, weights = NULL, eps = 1e-7) {
+ write(paste("** jacob coeffs", coefficients, "weights", weights, "eps", eps),
+ file="the.log", append=TRUE)
F0 <- FUN(coefficients, weights = weights)
out <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
colnames(out) <- names(coefficients)
@@ -8,6 +10,12 @@ jacobian <- function(FUN, coefficients, weights = NULL, eps = 1e-7) {
coeftemp <- coefficients
coeftemp[i] <- coeftemp[i] + eps
out[, i] <- (FUN(coeftemp, weights = weights) - F0) / eps
+ write(paste("*** i", i, "coeftemp", coeftemp,
+ "FUN", FUN(coeftemp, weights = weights),
+ "F0", F0,
+ "numer", (FUN(coeftemp, weights = weights) - F0),
+ "out[, i]", out[, i]),
+ file="the.log", append=TRUE)
}
out
}
diff --git a/tests/testthat/tests-core.R b/tests/testthat/tests-core.R
index a58f26f..997fba2 100644
@@ -65,6 +65,16 @@ if (requireNamespace("ridge")) {
model2 <- linearRidge(mpg ~ wt + cyl, data = mtcars, lambda=0)
expect_true(inherits(m2 <- margins(model2), "margins"), label = "margins works for linearRidge")
+ # TODO: remove
+ source('R/jacobian.R')
+ source('R/gradient_factory.R')
+ source('R/find_terms_in_model.R')
+ source('R/get_effect_variances.R')
+ source('R/reset_coefs.R')
+ vars1 <- get_effect_variances(mtcars, model1)
+ vars2 <- get_effect_variances(mtcars, model2)
+
+ # margins(model2)
m1 <- margins(model1)
s1 <- summary(m1)
s2 <- summary(m2)
```
I found a possible factor. `reset_coefs` changes the model coefficients. For `lm`, this is simply setting a named vector. For `ridgeLinear`, it's not entirely obvious how to do it; the class doesn't look set up to do it easily. I implemented nothing, so it was using the default method, which doesn't look right. The `coef.ridgeLinear` function returns a vector where the intercept is constructed out of three different pieces:
```r
inter <- object$ym - scaledcoef %*% object$xm
```
So perhaps the margins code is changing the model coefficients in a way that depends on state carried around inside the model object. This would have to be adapted for `ridgeLinear`, but exactly what is required here is not obvious to me.
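To make the idea concrete, here is a hypothetical sketch of what a `ridgeLinear`-specific `reset_coefs` method might look like. The field names (`$coef`, `$scales`, `$ym`, `$xm`) come from the `coef.ridgeLinear` arithmetic quoted above; the method signature and everything else here is an untested guess, not working code:

```r
# HYPOTHETICAL, untested: invert the arithmetic that coef.ridgeLinear uses
# (slopes = coef / scales, intercept = ym - scaledcoef %*% xm), storing the
# new values back on the model's internal scale so that coef() and predict()
# see the perturbed coefficients.
reset_coefs.ridgeLinear <- function(x, coefs) {
  slopes <- coefs[setdiff(names(coefs), "(Intercept)")]
  # undo the "/ scales" rescaling applied by coef.ridgeLinear
  x$coef <- slopes * x$scales[names(slopes)]
  # the intercept is derived from ym and xm, so shift ym to absorb the change
  scaledcoef <- t(as.matrix(x$coef / x$scales))
  x$ym <- drop(coefs["(Intercept)"] + scaledcoef %*% x$xm[names(slopes)])
  x
}
```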
Thanks for all your work - I'll try to figure out what needs to happen.
FYI, if something happened here, I think we'd use it.
We use margins to analyze results with heterogeneous treatment effects, and wish we had some reasonable shrinkage to potentially reduce false positives.
We are thinking about other ways to approach het effects as well.
This is a request for debugging assistance for a new feature, margins of ridge regression. If I can debug this, I'd be happy if you'd accept a pull request with the feature.
I am trying to modify your "prediction" and "margins" packages to find the marginal effects of a ridge regression.
The test and output look like this:
Clearly computing the standard error is not going well for my ridgeLinear object.
It is entirely possible that I've modified the ridgeLinear object to have some bug, or that I've incorrectly implemented the interface for it in the "prediction" library. The place I've traced to so far (the "FUN" gradient function in margins) seems to perhaps use the parent environment for some of its values. I could have a scoping problem. I'm looking for suggestions of things to try.
Within my "margins", I traced it with this code in the RStudio debugger:
The jacobian is always all 0s for model2 (the linearRidge model), so perhaps the world looks flat and it doesn't find the variances properly.
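One sanity check worth running (a sketch; `reset_coefs` is un-exported, so the `:::` access is an assumption): perturb one coefficient, push it through `reset_coefs`, and see whether the predictions move at all. If they don't, the all-zero Jacobian is explained, because the perturbation never reaches the model.

```r
# Untested debugging sketch: does the model respond to a coefficient change?
b <- coef(model2)
b[2] <- b[2] + 1e-7
m_perturbed <- margins:::reset_coefs(model2, b)
all.equal(predict(model2, mtcars), predict(m_perturbed, mtcars))
# TRUE here would confirm the coefficients are not actually being reset
```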
I can keep dropping into the debugging rathole by diving into the marginal_effects within the gradient function FUN ("me_tmp <- gr.." near line 20 of gradient_factory.R), but I wondered if you had any ideas.
Thanks for your attention.
Output of `sessionInfo`: