ncn-foreigners / nonprobsvy

An R package for modern methods for non-probability surveys
https://ncn-foreigners.github.io/nonprobsvy/
Other
29 stars 4 forks source link

Predictive mean matching -- `summary` and `bootstrap` #19

Closed BERENZ closed 11 months ago

BERENZ commented 12 months ago

Problem: summary does not work correctly for pmm and bootstrap returns different point estimates than non-bootstrap approach.

Code:

library(data.table)
library(nonprobsvy)
set.seed(2023-10-19)
n_reps <- 50
N <- 50000
n <- 1000
x1 <- rnorm(N,1,1)
x2 <- rexp(N,1)
alp <- rnorm(N)
epsilon <- rnorm(N)
y11 <- 1 + x1 + x2 + alp + epsilon
y12 <- 0.5*(x1-1.5)^2 + x2^2 + alp + epsilon
y21 <- rbinom(N,1,plogis(1 + x1 + x2 + alp))
y22 <- rbinom(N,1,plogis(0.5*(x1-1.5)^2 + x2^2 + alp))
p1 <- plogis(x2)
p2 <- plogis(-3+(x1-1.5)^2+(x2-2)^2)
pop_data <- data.frame(x1,x2,y11,y12,y21,y22,p1,p2) |> setDT()

sample_prob <- pop_data[sample(1:N, n),]
sample_prob$w <- N/n
sample_prob_svy <- svydesign(ids=~1, weights = ~w, data = sample_prob)
sample_bd1 <- pop_data[rbinom(N,1,pop_data$p1)==1, ]
sample_bd1$w_naive <- N/nrow(sample_bd1)
sample_bd2 <- pop_data[rbinom(N,1,pop_data$p2)==1, ]
sample_bd2$w_naive <- N/nrow(sample_bd2)

m1 <- nonprob(outcome = y11 ~ x1 + x2, data=sample_bd2, svydesign=sample_prob_svy, method_outcome = "pmm")
m2 <- nonprob(outcome = y11 ~ x1 + x2, data=sample_bd2, svydesign=sample_prob_svy, method_outcome = "pmm", 
             control_inference = controlInf(var_method = "bootstrap", num_boot = 20))

Errors regarding summary

> summary(m1)
Error in if (attr(k, "glm")) { : argument is of length zero

> summary(m2)
Error in if (attr(k, "glm")) { : argument is of length zero

Potential errors with bootstrap


> m1$output
        mean         SE
y11 2.912592 0.04656599
> m2$output
        mean         SE
y11 2.592961 0.03303442
> mean(sample_bd2$y11)
[1] 2.592326
LukaszChrostowski commented 11 months ago

done