leeper / margins

An R Port of Stata's 'margins' Command
https://cloud.r-project.org/package=margins

Suggestions for "margins" of ridge regression? #140

Open dfrankow opened 4 years ago

dfrankow commented 4 years ago

This is a request for debugging assistance for a new feature, margins of ridge regression. If I can debug this, I'd be happy if you'd accept a pull request with the feature.

I am trying to modify your "prediction" and "margins" packages to find the marginal effects of a ridge regression.

The test and output look like this:

> model1 <- lm(mpg ~ wt + cyl, data = mtcars)
> model2 <- linearRidge(mpg ~ wt + cyl, data = mtcars, lambda=0)
> expect_true(inherits(m2 <- margins(model2), "margins"), label = "margins works for linearRidge")
> m1 <- margins(model1)
> s1 <- summary(m1)
> s2 <- summary(m2)
> # Currently fails:
>     expect_equal(s1, s2)
Error: `s1` not equal to `s2`.
Attributes: < Component “call”: target, current do not match when deparsed >
Component “SE”: Mean relative difference: 1
Component “z”: Mean relative difference: Inf
Component “p”: Mean relative difference: 1
Component “lower”: Mean relative difference: 0.3282719
Component “upper”: Mean relative difference: 0.9557901
> s1
 factor     AME     SE       z      p   lower   upper
    cyl -1.5078 0.4147 -3.6360 0.0003 -2.3206 -0.6950
     wt -3.1910 0.7569 -4.2158 0.0000 -4.6745 -1.7075
> s2
 factor     AME     SE    z      p   lower   upper
    cyl -1.5078 0.0000 -Inf 0.0000 -1.5078 -1.5078
     wt -3.1910 0.0000 -Inf 0.0000 -3.1910 -3.1910

Clearly computing the standard error is not going well for my ridgeLinear object.

It is entirely possible that I've modified the ridgeLinear object to have some bug, or that I've incorrectly implemented the interface for it in the "prediction" package. The place I've traced to so far (the "FUN" gradient function in margins) seems to pull some of its values from the parent environment, so I could have a scoping problem. I'm looking for suggestions of things to try.
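For context, here is a base-R sketch of the scoping pattern being questioned: a factory returns a closure, so FUN finds `model` in the factory's enclosing environment rather than the caller's. That by itself is fine, since each call copies the captured model, swaps in the perturbed coefficients, and recomputes, provided the swap actually lands in the slot the prediction code reads. (Names below are illustrative, not margins' actual API.)

```r
# Sketch of a gradient-style closure; `make_fun` is a hypothetical stand-in
# for gradient_factory.
make_fun <- function(model) {
    function(coefs) {
        m <- model                    # looked up in the enclosing environment
        m$coefficients <- coefs       # copy-on-modify: captured model untouched
        sum(unlist(m$coefficients))   # stand-in for an average marginal effect
    }
}
FUN <- make_fun(list(coefficients = c(wt = 1, cyl = 2)))
FUN(c(wt = 1, cyl = 2))    # 3
FUN(c(wt = 1, cyl = 2.5))  # 3.5: perturbations propagate through the closure
```

So lexical scoping alone shouldn't zero out the gradient; the perturbed coefficients have to be failing to reach the prediction some other way.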

Within my copy of "margins", I traced it with this code in the RStudio debugger:

source('R/jacobian.R')
source('R/gradient_factory.R')
source('R/find_terms_in_model.R')
source('R/get_effect_variances.R')
source('R/reset_coefs.R')
vars1 <- get_effect_variances(mtcars, model1)
vars2 <- get_effect_variances(mtcars, model2)

The jacobian is always all 0s for model2 (the linearRidge model), so perhaps the world looks flat and it doesn't find the variances properly.
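To make the symptom concrete, here is a minimal base-R sketch of the finite-difference Jacobian that margins builds in R/jacobian.R: column i is (FUN(beta + eps * e_i) - FUN(beta)) / eps, and the SEs come from sandwiching vcov with this matrix. If FUN is insensitive to the perturbed coefficients, every column is exactly 0 and the SEs collapse to 0, as in s2 above. (`f_live` and `f_dead` are toy functions, not margins internals.)

```r
jacobian_sketch <- function(FUN, coefficients, eps = 1e-7) {
    F0 <- FUN(coefficients)
    out <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
    colnames(out) <- names(coefficients)
    for (i in seq_along(coefficients)) {
        coeftemp <- coefficients
        coeftemp[i] <- coeftemp[i] + eps
        out[, i] <- (FUN(coeftemp) - F0) / eps
    }
    out
}

# FUN that responds to its argument vs. one that ignores it
# (the ridgeLinear symptom):
f_live <- function(b) c(2 * b[[1]], b[[1]] + b[[2]])
f_dead <- function(b) c(-3.191, -1.508)

jacobian_sketch(f_live, c(wt = 1, cyl = 2))  # nonzero entries
jacobian_sketch(f_dead, c(wt = 1, cyl = 2))  # all zeros -> zero SEs
```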

I can keep dropping into the debugging rathole by diving into the marginal_effects within the gradient function FUN ("me_tmp <- gr.." near line 20 of gradient_factory.R), but I wondered if you had any ideas.

Thanks for your attention.

Output of sessionInfo:

> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] margins_0.3.23 testthat_2.3.2 ridge_2.5     

loaded via a namespace (and not attached):
[1] MASS_7.3-51.5     compiler_3.6.3    magrittr_1.5      R6_2.4.1          tools_3.6.3      
[6] yaml_2.2.1        data.table_1.12.8 rlang_0.4.5       prediction_0.3.16
dfrankow commented 4 years ago

I'll attach my logs so far, why not. They're a bit opaque, though.

Below are the log statements in diff form, and I attach two files:

The first obvious difference is at i=2.

1.log has numer=9.9e-08 and out=0.99 (long line goes off the screen):

*** i 2 coeftemp 39.6862614802529 FUN -3.1909720389839 F0 -3.1909721389834 numer 9.99994953509997e-08 out[, i] 0.999994953509997

2.log has numer=0 and out=0:

*** i 2 coeftemp 39.686261480253 FUN -3.19097213898376 F0 -3.19097213898376 numer 0 out[, i] 0

It's as if FUN=F0 for some reason I can't quite see in 2.log.

Log statements:

diff --git a/R/gradient_factory.R b/R/gradient_factory.R
index 4238150..c601512 100644
--- a/R/gradient_factory.R
+++ b/R/gradient_factory.R
@@ -33,6 +34,7 @@ gradient_factory.default <- function(data, model, variables = NULL, type = "resp
             # apply colMeans to get average marginal effects
             means <- apply(me_tmp, 2L, stats::weighted.mean, w = weights, na.rm = TRUE)
         }
+        write(paste("** FUN me_tmp", me_tmp, "means", means), file="the.log", append=TRUE)
         means
     }
     return(FUN)
diff --git a/R/jacobian.R b/R/jacobian.R
index 2ac79aa..1298cb9 100644
--- a/R/jacobian.R
+++ b/R/jacobian.R
@@ -1,5 +1,7 @@
 # function returns a Jacobian matrix (a term-by-beta matrix)
 jacobian <- function(FUN, coefficients, weights = NULL, eps = 1e-7) {
+  write(paste("** jacob coeffs", coefficients, "weights", weights, "eps", eps),
+        file="the.log", append=TRUE)
     F0 <- FUN(coefficients, weights = weights)
     out <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
     colnames(out) <- names(coefficients)
@@ -8,6 +10,12 @@ jacobian <- function(FUN, coefficients, weights = NULL, eps = 1e-7) {
         coeftemp <- coefficients
         coeftemp[i] <- coeftemp[i] + eps
         out[, i] <- (FUN(coeftemp, weights = weights) - F0) / eps
+        write(paste("*** i", i, "coeftemp", coeftemp,
+                    "FUN", FUN(coeftemp, weights = weights),
+                    "F0", F0,
+                    "numer", (FUN(coeftemp, weights = weights) - F0),
+                    "out[, i]", out[, i]),
+              file="the.log", append=TRUE)
     }
     out
 }
diff --git a/tests/testthat/tests-core.R b/tests/testthat/tests-core.R
index a58f26f..997fba2 100644
--- a/tests/testthat/tests-core.R
+++ b/tests/testthat/tests-core.R
@@ -65,6 +65,16 @@ if (requireNamespace("ridge")) {
     model2 <- linearRidge(mpg ~ wt + cyl, data = mtcars, lambda=0)
     expect_true(inherits(m2 <- margins(model2), "margins"), label = "margins works for linearRidge")

+    # TODO: remove
+    source('R/jacobian.R')
+    source('R/gradient_factory.R')
+    source('R/find_terms_in_model.R')
+    source('R/get_effect_variances.R')
+    source('R/reset_coefs.R')
+    vars1 <- get_effect_variances(mtcars, model1)
+    vars2 <- get_effect_variances(mtcars, model2)
+
+    # margins(model2)
     m1 <- margins(model1)
     s1 <- summary(m1)
     s2 <- summary(m2)

Attachments: 1.log, 2.log

dfrankow commented 4 years ago

I found a possible factor.

reset_coefs changes the model coefficients.

For lm, this is simply setting a named vector.

For ridgeLinear, it's not obvious how to do this, and the object doesn't look set up to do it easily. I implemented nothing, so margins fell back on the default method, which doesn't look right. The coef.ridgeLinear function returns a vector whose intercept is reconstructed from three different pieces:

inter <- object$ym - scaledcoef %*% object$xm

So perhaps the margins code is changing the model's coefficients in a way that, for ridgeLinear, depends on information carried around with the model object. reset_coefs would have to be adapted for ridgeLinear, but exactly what is required is not obvious to me.
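An illustrative sketch of this suspected failure mode, using a toy class (`toyRidge` and its slot names are hypothetical, not the real ridge internals): if predict() reads the working coefficients from a slot other than `coefficients`, then a default-style reset that writes into `model[["coefficients"]]` changes nothing the prediction sees, so FUN(coeftemp) == F0 and every Jacobian entry is 0.

```r
# Toy object stores its working slopes in `coef` and reconstructs the
# intercept from stored means, loosely mimicking coef.ridgeLinear.
toy <- structure(
    list(coef = c(wt = -3.19, cyl = -1.51),
         ym = 20.09, xm = c(wt = 3.22, cyl = 6.19)),
    class = "toyRidge"
)
predict.toyRidge <- function(object, newdata, ...) {
    inter <- object$ym - sum(object$coef * object$xm)
    inter + as.matrix(newdata[names(object$coef)]) %*% object$coef
}

# What a default-style reset amounts to: write into a slot predict() never reads.
reset_default <- function(model, coefs) {
    model[["coefficients"]] <- coefs
    model
}

nd <- data.frame(wt = 3, cyl = 6)
p0 <- predict(toy, nd)
p1 <- predict(reset_default(toy, c(wt = -3.19 + 1e-7, cyl = -1.51)), nd)
identical(p0, p1)  # TRUE: the perturbation never reaches the prediction
```

If this is what's happening, a ridgeLinear-specific reset_coefs would have to push the new values back into whatever slots predict.ridgeLinear actually consumes.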

leeper commented 4 years ago

Thanks for all your work - I'll try to figure out what needs to happen.

dfrankow commented 4 years ago

FYI, if something happened here, I think we'd use it.

We use margins to analyze results with heterogeneous treatment effects, and wish we had some reasonable shrinkage to potentially reduce false positives.

We are thinking about other ways to approach het effects as well.