leeper / margins

An R Port of Stata's 'margins' Command
https://cloud.r-project.org/package=margins

Error when svyglm drops observations due to missingness #164

Open mattysimonson opened 3 years ago

mattysimonson commented 3 years ago

The margins() function returns an error when a svyglm object has omitted observations from the original data due to missingness. It appears to have trouble reconciling the number of rows in the original data with the number of rows actually used.

## load package
library("margins")

# Create a survey design using the survey package vignette
library(survey)
data(api)
dstrat <- svydesign(id = ~1, strata = ~stype, data = apistrat, fpc = ~fpc)
dclus2 <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2)

# Run a regression
m1 <- svyglm(api00 ~ ell + meals + mobility, design = dclus2)

# So far so good, margins() works
margins(m1, design = dclus2)

# Now simulate what happens if values are missing
apiclus2_modified <- apiclus2
apiclus2_modified[1:10, "meals"] <- NA

# Create survey design and run regression
dclus2_modified <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2, data = apiclus2_modified)
m2 <- svyglm(api00 ~ ell + meals + mobility, design = dclus2_modified)

# margins() fails
margins(m2, design = dclus2_modified)

## session info for your system
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] survey_4.0      survival_3.1-12 Matrix_1.2-18   margins_0.3.26 

loaded via a namespace (and not attached):
 [1] MASS_7.3-51.4     compiler_3.6.1    DBI_1.1.0         tools_3.6.1      
 [5] splines_3.6.1     data.table_1.13.4 packrat_0.5.0     lattice_0.20-38  
 [9] mitools_2.4       prediction_0.3.14

# Error message:
Error in data.frame(..., check.rows = FALSE, check.names = FALSE, fix.empty.names = FALSE,  : 
  arguments imply differing number of rows: 252, 232

# Traceback:
13: stop(gettextf("arguments imply differing number of rows: %s", 
        paste(unique(nrows), collapse = ", ")), domain = NA)
12: data.frame(..., check.rows = FALSE, check.names = FALSE, fix.empty.names = FALSE, 
        stringsAsFactors = FALSE)
11: make_data_frame(out, fitted = unclass(tmp), se.fitted = sqrt(unname(attributes(tmp)[["var"]])))
10: prediction.svyglm(model = model, data = data.table::rbindlist(list(d0, 
        d1)), type = type, calculate_se = FALSE, ...)
9: prediction(model = model, data = data.table::rbindlist(list(d0, 
       d1)), type = type, calculate_se = FALSE, ...)
8: dydx.default(X[[i]], ...)
7: FUN(X[[i]], ...)
6: lapply(c(varslist$nnames, varslist$lnames), dydx, data = data, 
       model = model, type = type, eps = eps, as.data.frame = as.data.frame, 
       ...)
5: marginal_effects.glm(model = model, data = data, variables = variables, 
       type = type, eps = eps, varslist = varslist, ...)
4: marginal_effects(model = model, data = data, variables = variables, 
       type = type, eps = eps, varslist = varslist, ...)
3: build_margins(model = model, data = data_list[[i]], variables = variables, 
       type = type, vcov = vcov, vce = vce, iterations = iterations, 
       unit_ses = unit_ses, weights = wts, eps = eps, varslist = varslist, 
       ...)
2: margins.svyglm(m2, design = dclus2_modified)
1: margins(m2, design = dclus2_modified)
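
The two row counts in the error (252 and 232) look like twice the number of rows in the design and twice the number of rows the model kept, since `dydx.default` stacks two copies of the data (`d0` and `d1`) before calling `prediction()`. The sketch below checks that reading using only the survey package, then tries a possible workaround: dropping the incomplete rows from the design before fitting. The names `n_design`, `n_used`, `dclus2_complete`, and `m3` are made up for illustration, and I have not confirmed that `margins()` succeeds on the subsetted design, so treat that last step as an assumption.

```r
library(survey)

# Rebuild the failing setup from the report above
data(api)
apiclus2_modified <- apiclus2
apiclus2_modified[1:10, "meals"] <- NA

dclus2_modified <- svydesign(id = ~dnum + snum, fpc = ~fpc1 + fpc2,
                             data = apiclus2_modified)
m2 <- svyglm(api00 ~ ell + meals + mobility, design = dclus2_modified)

# Rows in the design versus rows the model actually used
n_design <- nrow(dclus2_modified$variables)
n_used   <- length(fitted(m2))
print(c(design = n_design, used = n_used, dropped = n_design - n_used))

# Possible workaround (assumption, not a confirmed fix): remove the
# incomplete rows from the design itself, so the design and the model
# agree on the number of rows
dclus2_complete <- subset(dclus2_modified, !is.na(meals))
m3 <- svyglm(api00 ~ ell + meals + mobility, design = dclus2_complete)
print(length(fitted(m3)) == nrow(dclus2_complete$variables))

# Then, with margins loaded:
# margins(m3, design = dclus2_complete)
```

If the mismatch is the cause, the design and model row counts should differ by the 10 rows set to `NA`, and they should match again once the design is subsetted.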

tdharvey02 commented 3 years ago

Did this ever get resolved? I am having the same problem. Thanks!

tzoltak commented 3 years ago

It looks like my pull request #159 solves this problem too.