IQSS / Zelig

A statistical framework that serves as a common interface to a large range of models
http://zeligproject.org
109 stars 43 forks source link

Problem extracting `att.ev` #243

Closed christophergandrud closed 7 years ago

christophergandrud commented 7 years ago

From @jefunes on https://github.com/IQSS/Zelig/issues/220:

I am trying to obtain att.ev from sim output but i keep getting an error, please see code below:

z.m_b1 <- zelig(log(wy_yield_plot+1) ~ hhead_age + hh_education_mean + xx55 + w_c + perc_own_title + bean_area + dem, data=out_data_m_b1_control,model="ls")

x.out1 <- setx(z.m_b1,data=match.data(m_b1="treat"),cond=TRUE)

s.out1 <- sim(z.m_b1, x=x.out1)

Here i tried different ways to obtain att.ev values from sim output:

s.out1$qi$att.ev
## Error in s.out1$qi$att.ev : object of type 'closure' is not subsettable

then i tried

s.out1$qi$att.ev
## Error: unexpected input in "s.out1"

Please advice, Thanks so much for your support.

christophergandrud commented 7 years ago

To help me explore this issue, @jefunes could you provide a minimal data set for reproduce the problem?

Preliminary thought: it looks like you are using old Zelig < 5 syntax and may want to use the ATT method. E.g. something like:

z.m_b1$ATT(treatment = "xx")
out <- z.m_b1$get_qi(qi = "ATT", xvalue = "TE")

Filling in the treatment and xvalue as relevant

jefunes commented 7 years ago

sim.zip

jefunes commented 7 years ago

Thanks so much for your quick reply. I will try the code you suggested. I attached a dataset. regards

jefunes commented 7 years ago

5.0.17 Zelig version i am using

christophergandrud commented 7 years ago

Can you send this data in a plain text format like CSV?

jefunes commented 7 years ago

sample code data.zip

jefunes commented 7 years ago

i sent you sample code and data. thanks so much.

christophergandrud commented 7 years ago

I'm sorry, but that is not a comma separated values formatted file (you can validate your CSV file with a tool like: https://csvlint.io/). Though I don't need it necessarily in CSV, I just need to be able to load it into R exactly as you do. So, can you please send:

(a) exactly the code minimal code necessary you used to reach the issue, including loading the data set and necessary packages

(b) the minimal data set that is loadable by the code in (a) you used to reach the issue.

That will really help me address this.

Also, have you tried the new syntax yet to see if that (or a close variation) works?

Thanks!

jefunes commented 7 years ago

I tried the new syntax but did not work. Please see the new file and code. Thanks so much. sample.zip

jefunes commented 7 years ago

it should work, i checked the csv file in the suggested tool.

christophergandrud commented 7 years ago

Thanks. This seems to be related to a larger issue with the methods (https://github.com/IQSS/Zelig/issues/245) that will require some work to untangle.

Could you give me more background on this data and what you are trying to achieve?

jefunes commented 7 years ago

Sorry for my late reply. It relates to agricultural technology effect on crop yields from a cross sectional dataset. I aim to distinguish between a control and a treatment group to compare yields. Please let me know if you have questions. Thanks.

christophergandrud commented 7 years ago

Sorry for the late reply (we've been working on some large updates). In Zelig version 5+ you probably want to do something like:

library(CBPS)
library(Matchit)
library(foreign)
library(Zelig)

psm_bush2 <- read.csv("/sample/psm_bush2.csv")

# Log transform before estimation, needed for setx
psm_bush2$log_wy_yield <- log(psm_bush2$wy_yield_plot+1) 

# remove observations with missing values
psm_bush2 <- psm_bush2[complete.cases(psm_bush2), ]

test_match <- matchit(HIB_adoption ~ hhead_age + hh_education_mean + xx55 + 
                        w_c + perc_own_title + bean_area + dem, 
                      method="nearest", data=psm_bush2, replace=TRUE)

# Estimate model with all matched data, the treatment group will be identified in ATT below
z.out <- zelig(log_wy_yield ~ HIB_adoption + hhead_age + 
                                hh_education_mean + xx55 + w_c + 
                                perc_own_title + bean_area + dem, 
                              data= match.data(test_match), model="ls")

# average treatment effect on the treated
z.out <- ATT(z.out, treatment = "HIB_adoption")
zatt <- get_qi(z.out, qi = "ATT", xvalue = "TE")

Hope that works. Just let me know if you have any other issues. I'll be updating the MatchIt docs to reflect this new workflow as well.

Jason5Gao commented 7 years ago

Hi christophergandrud I followed your code above, yet can't find function ATT, which package is it in anyway.

further, using the demo(analysis), I encountered same error as @jefunes when it comes to this line

ate.all <- c(s.out1$qi$att.ev, -s.out2$qi$att.ev) from here

my OS _
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 2.5
year 2016
month 04
day 14
svn rev 70478
language R
version.string R version 3.2.5 (2016-04-14) nickname Very, Very Secure Dishes

and using zelig 5.0-17

christophergandrud commented 7 years ago

Sorry. I forgot to mention that you'll need the newest version of Zelig. It's still working its way through CRAN, but you can download it with:


devtools::install_github('IQSS/Zelig')
christophergandrud commented 7 years ago

I've also removed the demo as it is out of date. That syntax is no longer supported at all.

RobertFeyerharm commented 7 years ago

I encountered the same error even after loading the most recent version (5.1.2) of Zelig.

An alternative solution advocated by Ho, Imai, King, Stuart on page 4 of the MatchIt documentation (located at http://r.iq.harvard.edu/docs/matchit/2.4-20/matchit.pdf) is to estimate the ATE using the parametric model applied to the matched data (described on page 12 of the same manual) setting the explanatory variables to their means and the treatment variable to 0 or 1:

m.out <- matchit(treat ~ age + educ + black + hispan + nodegree + married + re74 + re75, method = "nearest", data = lalonde)
z.out <- zelig(re78 ~ treat + age + educ + black + hispan + nodegree + married + re74 + re75, data = match.data(m.out), model = "ls")
x.out <- setx(z.out, treat=0)
x1.out <- setx(z.out, treat=1)
s.out <- Zelig:::sim(z.out, x = x.out, x1 = x1.out)
summary(s.out)
 sim x :
 -----
ev
      mean      sd      50%     2.5%    97.5%
1 5215.039 519.958 5211.513 4209.118 6212.532
pv
         mean       sd      50%      2.5%    97.5%
[1,] 4937.875 6950.108 4871.602 -8896.022 18854.36

 sim x1 :
 -----
ev
      mean       sd      50%     2.5%    97.5%
1 6548.237 542.2629 6546.104 5548.695 7618.993
pv
         mean     sd      50%      2.5%    97.5%
[1,] 6579.165 7054.5 6666.385 -7288.168 20001.87
fd
      mean       sd      50%      2.5%    97.5%
1 1333.198 800.5099 1371.253 -272.1439 2932.168

Use the fd (= first difference) estimate of 1333.198, 95% CI = (-272.1439, 2932.168) as your ATE estimate.