jmpsteen / medflex

Flexible mediation analysis using natural effect models in R
23 stars 6 forks source link

Problem with imputations (neImpute) and weights (neWeight) when fitting working models via vglm-function from VGAM package #4

Closed jmpsteen closed 8 years ago

jmpsteen commented 8 years ago

Dispatching predict for vglm fits (VGAM package) no longer works appropriately since the function predict.vglmis replaced by predictvglm. It still works for vgamfits though.

Because of this, the example below

library(medflex)
library(VGAM)
UPBdata$negaffcat <- cut(UPBdata$negaff, 3)
expData <- neWeight(negaffcat ~ att + gender + educ + age,
                    family = multinomial, data = UPBdata, FUN = vglm)

returns the error

Error in pred[i, as.character(x[i])] : subscript out of bounds

Thanks to Theis Lange for bringing this up.

Similarly, expData1 and expData2 in the code below should have the same regression weights.

UPBdata$negaffcount <- as.numeric(UPBdata$negaffcat)

expData1 <- neWeight(negaffcount ~ att + gender + educ + age,
                    family = poisson, data = UPBdata)

expData2 <- neWeight(negaffcount ~ att + gender + educ + age,
                    family = poissonff, data = UPBdata, FUN = vglm)

Because of the same reason the first code returns an error, the weights are not the same.

> head(weights(expData1))
[1] 0.8191681 0.9073526 0.9494244 0.9830567 1.0158802 1.0318637
> head(weights(expData2))
[1] 0.1645578 0.4492048 0.6454604 0.8549090 1.2328369 0.5956213

In addition, expData1, expData2 and expData3 in the code below should be the same. However, because of incorrect dispatching, when using vglm, predictions on the logit instead of probability scale are returned.

> expData1 <- neImpute(UPB ~ att + negaff + gender + educ + age,
+                     family = binomial, data = UPBdata)
> head(expData1)
  id          att0       att1 attbin attcat    negaff initiator gender educ age       UPB
1  1 -1.644093e+00  1.0005617      1      M  0.840461    myself      F    M  41 0.3091871
2  1 -5.974787e-01  1.0005617      1      M  0.840461    myself      F    M  41 0.3835087
3  1  8.023932e-06  1.0005617      1      M  0.840461    myself      F    M  41 0.4288055
4  1  5.974948e-01  1.0005617      1      M  0.840461    myself      F    M  41 0.4753273
5  1  1.644109e+00  1.0005617      1      M  0.840461    myself      F    M  41 0.5573637
6  2 -1.662279e+00 -0.7085889      0      L -1.257465      both      M    M  42 0.1490191
     negaffbin     negaffcat negaffcount
1  (0.84,3.05]  (0.104,1.58]           2
2  (0.84,3.05]  (0.104,1.58]           2
3  (0.84,3.05]  (0.104,1.58]           2
4  (0.84,3.05]  (0.104,1.58]           2
5  (0.84,3.05]  (0.104,1.58]           2
6 (-1.37,0.84] (-1.37,0.104]           1
> 
> expData2 <- neImpute(UPB ~ att + negaff + gender + educ + age,
+                      family = binomialff, data = UPBdata, FUN = vgam)
> head(expData2)
  id          att0       att1 attbin attcat    negaff initiator gender educ age       UPB
1  1 -1.644093e+00  1.0005617      1      M  0.840461    myself      F    M  41 0.3091871
2  1 -5.974787e-01  1.0005617      1      M  0.840461    myself      F    M  41 0.3835087
3  1  8.023932e-06  1.0005617      1      M  0.840461    myself      F    M  41 0.4288055
4  1  5.974948e-01  1.0005617      1      M  0.840461    myself      F    M  41 0.4753273
5  1  1.644109e+00  1.0005617      1      M  0.840461    myself      F    M  41 0.5573637
6  2 -1.662279e+00 -0.7085889      0      L -1.257465      both      M    M  42 0.1490191
     negaffbin     negaffcat negaffcount
1  (0.84,3.05]  (0.104,1.58]           2
2  (0.84,3.05]  (0.104,1.58]           2
3  (0.84,3.05]  (0.104,1.58]           2
4  (0.84,3.05]  (0.104,1.58]           2
5  (0.84,3.05]  (0.104,1.58]           2
6 (-1.37,0.84] (-1.37,0.104]           1
>
> expData3 <- neImpute(UPB ~ att + negaff + gender + educ + age,
+                      family = binomialff, data = UPBdata, FUN = vglm)
> head(expData3)
  id          att0       att1 attbin attcat    negaff initiator gender educ age         UPB
1  1 -1.644093e+00  1.0005617      1      M  0.840461    myself      F    M  41 -0.80392224
2  1 -5.974787e-01  1.0005617      1      M  0.840461    myself      F    M  41 -0.47468177
3  1  8.023932e-06  1.0005617      1      M  0.840461    myself      F    M  41 -0.28672643
4  1  5.974948e-01  1.0005617      1      M  0.840461    myself      F    M  41 -0.09877108
5  1  1.644109e+00  1.0005617      1      M  0.840461    myself      F    M  41  0.23046938
6  2 -1.662279e+00 -0.7085889      0      L -1.257465      both      M    M  42 -1.74231513
     negaffbin     negaffcat negaffcount
1  (0.84,3.05]  (0.104,1.58]           2
2  (0.84,3.05]  (0.104,1.58]           2
3  (0.84,3.05]  (0.104,1.58]           2
4  (0.84,3.05]  (0.104,1.58]           2
5  (0.84,3.05]  (0.104,1.58]           2
6 (-1.37,0.84] (-1.37,0.104]           1

Finally, neMod1 and neMod2 should have the same coefficients. Because of incorrect dispatching in medflex:::neModelEst, the wrong weights are used for neMod2 (cf previous example).

> expData <- neImpute(UPB ~ attbin * negaff + gender + educ + age,
+                     family = binomial, data = UPBdata)
> 
> expFit1 <- glm(attbin ~ gender + educ + age, family = binomial, data = UPBdata)
> expFit2 <- vglm(attbin ~ gender + educ + age, family = binomialff, data = UPBdata)
> 
> neMod1 <- neModel(UPB ~ attbin0 * attbin1, family = binomial,
+                   expData = expData, xFit = expFit1, nBoot = 10)
> neMod2 <- neModel(UPB ~ attbin0 * attbin1, family = binomial,
+                   expData = expData, xFit = expFit2, nBoot = 10)
There were 12 warnings (use warnings() to see them)
> 
> coef(neMod1)
    (Intercept)         attbin0         attbin1 attbin0:attbin1 
     -0.9367622       0.3838127       0.3026838       0.0542307 
> coef(neMod2)
    (Intercept)         attbin0         attbin1 attbin0:attbin1 
    -0.92907407      0.37337514      0.31289696      0.07854928 
jmpsteen commented 8 years ago

Patched by replacing predict by predictFUN in functions neImpute.default, neWeight.default and medflex:::neModelEst and defining predictFUN according to the class of the fit object, as below:

predictFUN <- if (inherits(fit, "vglm")) 
      VGAM::predictvglm
    else if (inherits(fit, "vgam")) 
      VGAM::predict.vgam
    else predict