Open adamSales opened 1 year ago
Oh yeah, I also think there should be optional output like, say, R^2 from the random forests or whatever. we could go to town
Formatting the "Pair" vector for p_loop
:
Background: I am using p_loop()
to estimate effects from a pair-matching design from an observational study, using the pairmatch()
function from the optmatch
package. That function returns a vector of pair IDs as a factor, where unmatched units have <NA>
for their ID. e.g.:
> head(match)
1 2 3 4 5 6
<NA> 1.1 <NA> <NA> <NA> <NA>
Plugging this into p_loop()
led to two problems, one small and one serious. The small problem was that the P
argument in p_loop()
only takes numeric vectors. So:
> p_loop(Y=dat$y,Tr=dat$z,P=match,Z=cbind(prog))
Error in Z1 - Z2 : non-numeric argument to binary operator
I would think it would be fairly easy to allow for factors or character vectors, too. If not, the documentation should specify that P
must be numeric.
More serious:
The final output from p_loop()
uses the full vectors of Y and Tr to estimate effects, including those cases with missing P
, and throws an uninterpretable warning:
> p_loop(Y=dat$y,Tr=dat$z,P=as.numeric(match),Z=cbind(prog))
[1] 4.447821 2.663642
Warning message:
In Y[Tr == 1] - Y[Tr == 0] :
longer object length is not a multiple of shorter object length
The problem comes in this line of p_loop
:
tauhat = mean(Y[Tr == 1] - Y[Tr == 0]) - mean((2 * assigned$Tr - 1) * d)
If I exclude the unmatched cases at the outset I get the right answer (with no warning):
> p_loop(Y=dat$y[!is.na(match)],Tr=dat$z[!is.na(match)],P=as.numeric(match[!is.na(match)]),Z=cbind(prog[!is.na(match)]))
[1] 3.675760 2.717145
IMO the best solution would be to change the tauhat =
line above to something like:
tauhat = with(assigned, mean((2*Tr - 1)*(Y1-Y2-d)))
since it uses assigned
for the whole thing. An alternative would be to throw an error if there's an NA in P
, or at least an informative warning.
This seems like as good a place as any to collect issues related to loop.estimator @johanngb
Really these should be separate issues, but then again really loop.estimator should be its own repo (I am tempted to make one, but really @johanngb should own it I think)
Here are two suggestions: