manncz / dRCT

Design-based analysis of RCTs
1 stars 0 forks source link

formatting and missing values in pair vector for p-loop #3

Closed manncz closed 1 year ago

manncz commented 1 year ago

Formatting the "Pair" vector for p_loop:

Background: I am using p_loop() to estimate effects from a pair-matching design from an observational study, using the pairmatch() function from the optmatch package. That function returns a vector of pair IDs as a factor, where unmatched units have <NA> for their ID. e.g.:

> head(match)
   1    2    3    4    5    6 
<NA>  1.1 <NA> <NA> <NA> <NA> 

Plugging this into p_loop() led to two problems, one small and one serious. The small problem was that the P argument in p_loop() only takes numeric vectors. So:

> p_loop(Y=dat$y,Tr=dat$z,P=match,Z=cbind(prog))
Error in Z1 - Z2 : non-numeric argument to binary operator

I would think it would be fairly easy to allow for factors or character vectors, too. If not, the documentation should specify that P must be numeric.

More serious: The final output from p_loop() uses the full vectors of Y and Tr to estimate effects, including those cases with missing P, and throws an uninterpretable warning:

> p_loop(Y=dat$y,Tr=dat$z,P=as.numeric(match),Z=cbind(prog))
[1] 4.447821 2.663642
Warning message:
In Y[Tr == 1] - Y[Tr == 0] :
  longer object length is not a multiple of shorter object length

The problem comes in this line of p_loop:

tauhat = mean(Y[Tr == 1] - Y[Tr == 0]) - mean((2 * assigned$Tr - 1) * d)

If I exclude the unmatched cases at the outset I get the right answer (with no warning):

> p_loop(Y=dat$y[!is.na(match)],Tr=dat$z[!is.na(match)],P=as.numeric(match[!is.na(match)]),Z=cbind(prog[!is.na(match)]))
[1] 3.675760 2.717145

IMO the best solution would be to change the tauhat = line above to something like:

tauhat = with(assigned, mean((2*Tr - 1)*(Y1-Y2-d)))

since it uses assigned for the whole thing. An alternative would be to throw an error if there's an NA in P, or at least an informative warning.

Originally posted by @adamSales in https://github.com/adamSales/rebarLoop/issues/3#issuecomment-1483181649

manncz commented 1 year ago

c97c0fe addresses this issue