IQSS / Zelig

A statistical framework that serves as a common interface to a large range of models
http://zeligproject.org

Normal Survey Regression Example 1 has inconsistent results #273

mbsabath closed this issue 7 years ago

mbsabath commented 7 years ago

Zelig version: 5.1.1
OS: macOS 10.12.5
R: 3.4.0

When I run Normal Survey Regression Example 1 on my machine, the results don't match those in the vignette, and they stay the same whatever random seed I use. My guess at the cause: both the code and the vignette text refer to a vector of weights that the data setup code never creates or includes.

Published Results:

Model: 

 Call:
 z5$zelig(formula = api00 ~ meals + yr.rnd, data = apistrat, weights = ~pw)

 Survey design:
 survey::svydesign(data = data, ids = ids, probs = probs, strata = strata, 
     fpc = fpc, nest = nest, check.strata = check.strata, weights = localWeights)

 Coefficients:
             Estimate Std. Error t value Pr(>|t|)
 (Intercept) 846.0409     8.9836  94.176   <2e-16
 meals        -3.4882     0.1678 -20.783   <2e-16
 yr.rndYes   -15.4566    15.6413  -0.988    0.324

 (Dispersion parameter for gaussian family taken to be 155138.3)

 Number of Fisher Scoring iterations: 2 

 Next step: Use 'setx' method

My Results:

Model: 

Call:
z5$zelig(formula = api00 ~ meals + yr.rnd, data = apistrat, weights = ~pw)

Survey design:
survey::svydesign(data = data, ids = ids, probs = probs, strata = strata, 
    fpc = fpc, nest = nest, check.strata = check.strata, weights = localWeights)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 825.1058     9.3912  87.859   <2e-16
meals        -3.3581     0.1698 -19.781   <2e-16
yr.rndYes    -6.3855    15.4044  -0.415    0.679

(Dispersion parameter for gaussian family taken to be 5225.087)

Number of Fisher Scoring iterations: 2

Next step: Use 'setx' method

christophergandrud commented 7 years ago

Thanks for reporting this @mbsabath. I'll take a look.

christophergandrud commented 7 years ago

Actually, the more I think about it, and taking into account thoughts from @cchoirat and @izahn, I think we should deprecate Zelig survey and replace it with a vignette on using the survey package with to_zelig.

Consistently and accurately passing all of the possible survey arguments and design objects has been difficult to implement and hard to test.

I think the relevant parts of the actual output from surveyglm are just a usual glm fitted model object.
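
For reference, the point about the fitted object can be checked with the survey package on its own. This is only a minimal sketch: the design arguments below (`id`, `strata`, `fpc`) follow the survey package's standard example for `apistrat` and are an assumption, not necessarily the design the Zelig vignette builds internally.

```r
library(survey)

# Example data shipped with the survey package; apistrat includes
# the sampling-weight column `pw` and the population-size column `fpc`.
data(api, package = "survey")

# Assumed stratified design (not taken from the Zelig vignette).
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# Same formula as in the issue, fitted directly with survey::svyglm.
fit <- svyglm(api00 ~ meals + yr.rnd, design = dstrat)

class(fit)            # lists the classes the fitted object carries
inherits(fit, "glm")  # checks whether glm methods can dispatch on it
coef(fit)             # point estimates; no random seed is involved
```

If `inherits(fit, "glm")` is true, standard glm accessors such as `coef()` and `vcov()` apply to the object, which is the property a `to_zelig`-based workflow would lean on.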

Thoughts?

christophergandrud commented 7 years ago

Rather than completely deprecating Zelig survey in an upcoming release, maybe we should have it:

christophergandrud commented 7 years ago

Implemented in 21171f7ef036db294142f18ebc7924850c144bdc

I wonder if the warning's wording should be stronger. Currently:

Warning message:
Not all features are available in Zelig Survey.
Consider using surveyglm and setx directly.
For details see: <http://docs.zeligproject.org/articles/to_zelig.html>. 

https://github.com/IQSS/Zelig/commit/21171f7ef036db294142f18ebc7924850c144bdc#diff-2f97fa35a7a84d0f068f3e1c1f31c742R28

christophergandrud commented 7 years ago

Merged into master and ready for CRAN release.