leeper / margins

An R Port of Stata's 'margins' Command
https://cloud.r-project.org/package=margins

Suggestions for "margins" of ridge regression? #140

Open dfrankow opened 4 years ago

dfrankow commented 4 years ago

This is a request for debugging assistance for a new feature, margins of ridge regression. If I can debug this, I'd be happy if you'd accept a pull request with the feature.

I am trying to modify your "prediction" and "margins" packages to find the marginal effects of a ridge regression.

The test and output look like this:

> model1 <- lm(mpg ~ wt + cyl, data = mtcars)
> model2 <- linearRidge(mpg ~ wt + cyl, data = mtcars, lambda=0)
> expect_true(inherits(m2 <- margins(model2), "margins"), label = "margins works for linearRidge")
> m1 <- margins(model1)
> s1 <- summary(m1)
> s2 <- summary(m2)
> # Currently fails:
>     expect_equal(s1, s2)
Error: `s1` not equal to `s2`.
Attributes: < Component “call”: target, current do not match when deparsed >
Component “SE”: Mean relative difference: 1
Component “z”: Mean relative difference: Inf
Component “p”: Mean relative difference: 1
Component “lower”: Mean relative difference: 0.3282719
Component “upper”: Mean relative difference: 0.9557901
> s1
 factor     AME     SE       z      p   lower   upper
    cyl -1.5078 0.4147 -3.6360 0.0003 -2.3206 -0.6950
     wt -3.1910 0.7569 -4.2158 0.0000 -4.6745 -1.7075
> s2
 factor     AME     SE    z      p   lower   upper
    cyl -1.5078 0.0000 -Inf 0.0000 -1.5078 -1.5078
     wt -3.1910 0.0000 -Inf 0.0000 -3.1910 -3.1910

Clearly computing the standard error is not going well for my ridgeLinear object.

It is entirely possible that I've modified the ridgeLinear object to have some bug, or that I've incorrectly implemented the interface for it in the "prediction" package. The place I've traced to so far (the "FUN" gradient function in margins) seems to pull some of its values from the parent environment, so I could have a scoping problem. I'm looking for suggestions of things to try.
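For context, here is a base-R sketch of the scoping pattern being questioned: a factory returns a closure, so FUN finds `model` in the factory's enclosing environment rather than the caller's. That by itself is fine, since each call copies the captured model, swaps in the perturbed coefficients, and recomputes, provided the swap actually lands in the slot the prediction code reads. (Names below are illustrative, not margins' actual API.)

```r
# Sketch of a gradient-style closure; `make_fun` is a hypothetical stand-in
# for gradient_factory.
make_fun <- function(model) {
    function(coefs) {
        m <- model                    # looked up in the enclosing environment
        m$coefficients <- coefs       # copy-on-modify: captured model untouched
        sum(unlist(m$coefficients))   # stand-in for an average marginal effect
    }
}
FUN <- make_fun(list(coefficients = c(wt = 1, cyl = 2)))
FUN(c(wt = 1, cyl = 2))    # 3
FUN(c(wt = 1, cyl = 2.5))  # 3.5: perturbations propagate through the closure
```

So lexical scoping alone shouldn't zero out the gradient; the perturbed coefficients have to be failing to reach the prediction some other way.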

Within my copy of "margins", I traced it with this code in the RStudio debugger:

source('R/jacobian.R')
source('R/gradient_factory.R')
source('R/find_terms_in_model.R')
source('R/get_effect_variances.R')
source('R/reset_coefs.R')
vars1 <- get_effect_variances(mtcars, model1)
vars2 <- get_effect_variances(mtcars, model2)

The jacobian is always all 0s for model2 (the linearRidge model), so perhaps the world looks flat and it doesn't find the variances properly.
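To make the symptom concrete, here is a minimal base-R sketch of the finite-difference Jacobian that margins builds in R/jacobian.R: column i is (FUN(beta + eps * e_i) - FUN(beta)) / eps, and the SEs come from sandwiching vcov with this matrix. If FUN is insensitive to the perturbed coefficients, every column is exactly 0 and the SEs collapse to 0, as in s2 above. (`f_live` and `f_dead` are toy functions, not margins internals.)

```r
jacobian_sketch <- function(FUN, coefficients, eps = 1e-7) {
    F0 <- FUN(coefficients)
    out <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
    colnames(out) <- names(coefficients)
    for (i in seq_along(coefficients)) {
        coeftemp <- coefficients
        coeftemp[i] <- coeftemp[i] + eps
        out[, i] <- (FUN(coeftemp) - F0) / eps
    }
    out
}

# FUN that responds to its argument vs. one that ignores it
# (the ridgeLinear symptom):
f_live <- function(b) c(2 * b[[1]], b[[1]] + b[[2]])
f_dead <- function(b) c(-3.191, -1.508)

jacobian_sketch(f_live, c(wt = 1, cyl = 2))  # nonzero entries
jacobian_sketch(f_dead, c(wt = 1, cyl = 2))  # all zeros -> zero SEs
```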

I can keep dropping into the debugging rathole by diving into the marginal_effects within the gradient function FUN ("me_tmp <- gr.." near line 20 of gradient_factory.R), but I wondered if you had any ideas.

Thanks for your attention.

Output of sessionInfo:

> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] margins_0.3.23 testthat_2.3.2 ridge_2.5     

loaded via a namespace (and not attached):
[1] MASS_7.3-51.5     compiler_3.6.3    magrittr_1.5      R6_2.4.1          tools_3.6.3      
[6] yaml_2.2.1        data.table_1.12.8 rlang_0.4.5       prediction_0.3.16
dfrankow commented 4 years ago

I'll attach my logs so far, why not. They're a bit opaque, though.

Below are the log statements in diff form, and I attach two files:

The first obvious difference is at i=2.

1.log has numer=9.9e-08 and out=0.99 (long line goes off the screen):

*** i 2 coeftemp 39.6862614802529 FUN -3.1909720389839 F0 -3.1909721389834 numer 9.99994953509997e-08 out[, i] 0.999994953509997

2.log has numer=0 and out=0:

*** i 2 coeftemp 39.686261480253 FUN -3.19097213898376 F0 -3.19097213898376 numer 0 out[, i] 0

It's as if FUN=F0 for some reason I can't quite see in 2.log.

Log statements:

diff --git a/R/gradient_factory.R b/R/gradient_factory.R
index 4238150..c601512 100644
--- a/R/gradient_factory.R
+++ b/R/gradient_factory.R
@@ -33,6 +34,7 @@ gradient_factory.default <- function(data, model, variables = NULL, type = "resp
             # apply colMeans to get average marginal effects
             means <- apply(me_tmp, 2L, stats::weighted.mean, w = weights, na.rm = TRUE)
         }
+        write(paste("** FUN me_tmp", me_tmp, "means", means), file="the.log", append=TRUE)
         means
     }
     return(FUN)
diff --git a/R/jacobian.R b/R/jacobian.R
index 2ac79aa..1298cb9 100644
--- a/R/jacobian.R
+++ b/R/jacobian.R
@@ -1,5 +1,7 @@
 # function returns a Jacobian matrix (a term-by-beta matrix)
 jacobian <- function(FUN, coefficients, weights = NULL, eps = 1e-7) {
+  write(paste("** jacob coeffs", coefficients, "weights", weights, "eps", eps),
+        file="the.log", append=TRUE)
     F0 <- FUN(coefficients, weights = weights)
     out <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
     colnames(out) <- names(coefficients)
@@ -8,6 +10,12 @@ jacobian <- function(FUN, coefficients, weights = NULL, eps = 1e-7) {
         coeftemp <- coefficients
         coeftemp[i] <- coeftemp[i] + eps
         out[, i] <- (FUN(coeftemp, weights = weights) - F0) / eps
+        write(paste("*** i", i, "coeftemp", coeftemp,
+                    "FUN", FUN(coeftemp, weights = weights),
+                    "F0", F0,
+                    "numer", (FUN(coeftemp, weights = weights) - F0),
+                    "out[, i]", out[, i]),
+              file="the.log", append=TRUE)
     }
     out
 }
diff --git a/tests/testthat/tests-core.R b/tests/testthat/tests-core.R
index a58f26f..997fba2 100644
--- a/tests/testthat/tests-core.R
+++ b/tests/testthat/tests-core.R
@@ -65,6 +65,16 @@ if (requireNamespace("ridge")) {
     model2 <- linearRidge(mpg ~ wt + cyl, data = mtcars, lambda=0)
     expect_true(inherits(m2 <- margins(model2), "margins"), label = "margins works for linearRidge")

+    # TODO: remove
+    source('R/jacobian.R')
+    source('R/gradient_factory.R')
+    source('R/find_terms_in_model.R')
+    source('R/get_effect_variances.R')
+    source('R/reset_coefs.R')
+    vars1 <- get_effect_variances(mtcars, model1)
+    vars2 <- get_effect_variances(mtcars, model2)
+
+    # margins(model2)
     m1 <- margins(model1)
     s1 <- summary(m1)
     s2 <- summary(m2)

Attachments: 1.log, 2.log

dfrankow commented 4 years ago

I found a possible factor.

reset_coefs changes the model coefficients.

For lm, this is simply setting a named vector.

For ridgeLinear, it's not obvious how to do this, and the object doesn't look set up to do it easily. I implemented nothing, so margins fell back on the default method, which doesn't look right. The coef.ridgeLinear function returns a vector whose intercept is reconstructed from three different pieces:

inter <- object$ym - scaledcoef %*% object$xm

So perhaps the margins code is changing the model's coefficients in a way that, for ridgeLinear, depends on information carried around with the model object. reset_coefs would have to be adapted for ridgeLinear, but exactly what is required is not obvious to me.
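An illustrative sketch of this suspected failure mode, using a toy class (`toyRidge` and its slot names are hypothetical, not the real ridge internals): if predict() reads the working coefficients from a slot other than `coefficients`, then a default-style reset that writes into `model[["coefficients"]]` changes nothing the prediction sees, so FUN(coeftemp) == F0 and every Jacobian entry is 0.

```r
# Toy object stores its working slopes in `coef` and reconstructs the
# intercept from stored means, loosely mimicking coef.ridgeLinear.
toy <- structure(
    list(coef = c(wt = -3.19, cyl = -1.51),
         ym = 20.09, xm = c(wt = 3.22, cyl = 6.19)),
    class = "toyRidge"
)
predict.toyRidge <- function(object, newdata, ...) {
    inter <- object$ym - sum(object$coef * object$xm)
    inter + as.matrix(newdata[names(object$coef)]) %*% object$coef
}

# What a default-style reset amounts to: write into a slot predict() never reads.
reset_default <- function(model, coefs) {
    model[["coefficients"]] <- coefs
    model
}

nd <- data.frame(wt = 3, cyl = 6)
p0 <- predict(toy, nd)
p1 <- predict(reset_default(toy, c(wt = -3.19 + 1e-7, cyl = -1.51)), nd)
identical(p0, p1)  # TRUE: the perturbation never reaches the prediction
```

If this is what's happening, a ridgeLinear-specific reset_coefs would have to push the new values back into whatever slots predict.ridgeLinear actually consumes.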

leeper commented 4 years ago

Thanks for all your work - I'll try to figure out what needs to happen.

dfrankow commented 4 years ago

FYI, if something happened here, I think we'd use it.

We use margins to analyze results with heterogeneous treatment effects, and wish we had some reasonable shrinkage to potentially reduce false positives.

We are thinking about other ways to approach het effects as well.