Closed BERENZ closed 12 months ago
Furthermore, if I run the following code
y11_corr_one <- nonprob(selection = ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10,
                        target = ~ Y_11 + Y_12,
                        data = sample_B1,
                        svydesign = sample_A_svy_cal,
                        method_selection = "logit",
                        control_inference = controlInf(vars_selection = TRUE),
                        control_selection = controlSel(nfolds = 5, est_method_sel = "gee", h = 1),
                        verbose = T)
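and then call summary(y11_corr_one), the variable names are missing from the regression coefficient table (the rows are labelled [1,], [2,], ... instead of with the selected covariates):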
> y11_corr_one
Call:
nonprob(data = sample_B1, selection = ~X1 + X2 + X3 + X4 + X5 +
X6 + X7 + X8 + X9 + X10, target = ~Y_11 + Y_12, svydesign = sample_A_svy_cal,
method_selection = "logit", control_selection = controlSel(nfolds = 5,
est_method_sel = "gee", h = 1), control_inference = controlInf(vars_selection = TRUE),
verbose = T)
Estimated population mean with overall std.err and confidence interval:
mean SE lower_bound upper_bound
Y_11 2.133006 0.2454810 1.651872 2.614140
Y_12 6.624884 0.4378084 5.766795 7.482973
> summary(y11_corr_one)
Call:
nonprob(data = sample_B1, selection = ~X1 + X2 + X3 + X4 + X5 +
X6 + X7 + X8 + X9 + X10, target = ~Y_11 + Y_12, svydesign = sample_A_svy_cal,
method_selection = "logit", control_selection = controlSel(nfolds = 5,
est_method_sel = "gee", h = 1), control_inference = controlInf(vars_selection = TRUE),
verbose = T)
-------------------------
Estimated population mean: 2.1336.625 with overall std.err of: 0.2455
And std.err for nonprobability and probability samples being respectively:
0.07868 and 0.4378
95% Confidence inverval for popualtion mean:
lower_bound upper_bound
Y_11 1.651872 2.614140
Y_12 5.766795 7.482973
Based on: Inverse probability weighted method
For a population of estimate size:
Obtained on a nonprobability sample of size: 2267
With an auxiliary probability sample of size: 510
-------------------------
Regression coefficients:
-----------------------
For glm regression on selection variable:
Estimate Std. Error z value P(>|z|)
[1,] -1.98191 0.01218 -162.67 <2e-16 ***
[2,] 1.03636 0.01191 87.01 <2e-16 ***
[3,] 1.13838 0.01312 86.75 <2e-16 ***
[4,] 0.98006 0.01214 80.76 <2e-16 ***
[5,] 1.01880 0.01160 87.83 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
-------------------------
Weights:
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.003 1.320 1.932 4.411 3.467 209.363
It seems that this problem occurs with both h = 1 and h = 2.
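The [1,], [2,], ... row labels in the output above are what base R prints for a coefficient matrix that has no row names. The sketch below only reproduces that symptom in plain R and shows that it disappears once row names are set. It is not a diagnosis of where nonprob loses the names. The estimates and standard errors are copied from the first three rows of the table above, and the labels "(Intercept)", "X_a" and "X_b" are placeholders, since the output does not show which covariates were actually selected.

```r
# Estimates and standard errors copied from the first three rows of the output above
est <- c(-1.98191, 1.03636, 1.13838)
se  <- c(0.01218, 0.01191, 0.01312)
z   <- est / se

# Binding unnamed vectors gives a matrix with column names but no row names
coef_tab <- cbind(Estimate     = est,
                  `Std. Error` = se,
                  `z value`    = z,
                  `P(>|z|)`    = 2 * pnorm(-abs(z)))
stopifnot(is.null(rownames(coef_tab)))

# Without row names, the rows are printed as [1,], [2,], [3,]
printCoefmat(coef_tab, has.Pvalue = TRUE)

# Placeholder labels (assumption): the real ones would be the selected covariates
coef_named <- coef_tab
rownames(coef_named) <- c("(Intercept)", "X_a", "X_b")
printCoefmat(coef_named, has.Pvalue = TRUE)
```

If the names are dropped somewhere between the variable selection step and the summary method, carrying the selected column names through to the coefficient matrix should be enough to restore the labels.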
DONE
Currently, the nonprobSel function applied to the IPW estimator iterates over the target parameter, i.e. variable selection is run once for each variable specified in the target parameter. Here's an example: