JasjeetSekhon / Matching


Impossible to reconstruct the matched dataset #4

Open clandinq opened 4 months ago

clandinq commented 4 months ago

I ran into this package looking for an alternative to the package MatchIt, which cannot generate the matching estimator developed by Abadie and Imbens and therefore does not estimate the ATE for nearest neighbor matching. I have been trying to understand which observations from the original dataset are kept in the matched dataset. Two different methods have not worked out for me:

  1. Using the mdata object

    • The Matching manual states that mdata is a list containing the matched datasets produced by Match. However, the dimensions of these vectors and matrices correspond neither to the original sample size nor to the size of the matched dataset (e.g., as given by the length of the weights object).
  2. Generating the matched dataset from index.treated and index.control.

    • The accompanying paper says that

      the index.control and index.treated indices which are in the object returned by Match are vectors containing the observation numbers from the original dataset for the treated (control) observations in the matched dataset. Both indices together can be used to construct the matched dataset. The matched dataset is also returned in the mdata object—see the Match manual page for details.

    • However, when I concatenate these two vectors and keep only the unique values, the number of values I get differs from the length of the weights object, so I cannot reconstruct the matched dataset.

Here is a minimal working example:

library("Matching")
library("dplyr")
data("lalonde")
Y <- lalonde$re78
Tr <- lalonde$treat
glm1 <- glm(Tr ~ age + educ + black + hisp + married + nodegr + re74 + re75,
            family = binomial, data = lalonde)
rr1 <- Match(Y = Y, Tr = Tr, X = glm1$fitted)
c(rr1[["index.treated"]], rr1[["index.control"]]) %>% unique() %>% length()

JasjeetSekhon commented 4 months ago

In your example,

mat = cbind(rr1[["index.treated"]], rr1[["index.control"]], rr1$weights)

That is the matched dataset.
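To make the row-per-pair structure concrete, here is a minimal sketch that expands the three columns into actual rows of lalonde. The names pairs, matched, pair, role and pair_weight are illustrative and not part of the package; the sketch assumes the default Match arguments used in the example above, so that index.treated holds treated rows and index.control holds control rows.

```r
library("Matching")
data("lalonde")

Y  <- lalonde$re78
Tr <- lalonde$treat
glm1 <- glm(Tr ~ age + educ + black + hisp + married + nodegr + re74 + re75,
            family = binomial, data = lalonde)
rr1 <- Match(Y = Y, Tr = Tr, X = glm1$fitted)

# One row per matched pair: row numbers in the original data plus the pair weight.
pairs <- data.frame(
  treated_row = rr1$index.treated,
  control_row = rr1$index.control,
  weight      = rr1$weights
)

# Long format: stack the treated row and the control row of every pair.
# The same original row can appear more than once.
n_pairs <- nrow(pairs)
matched <- rbind(
  data.frame(pair = seq_len(n_pairs), role = "treated",
             lalonde[pairs$treated_row, ], pair_weight = pairs$weight,
             row.names = NULL),
  data.frame(pair = seq_len(n_pairs), role = "control",
             lalonde[pairs$control_row, ], pair_weight = pairs$weight,
             row.names = NULL)
)

head(matched[order(matched$pair), ])
```

Any analysis on matched should use pair_weight, since each row of the matched dataset stands for a pair and not for a distinct observation.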

See the help page for the weights variable.

By default, one is matching with replacement (as in Abadie and Imbens).
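A short sketch of why the count of unique indices need not equal the length of weights: with replacement, the same control row can be matched to several treated rows, and (assuming the default ties = TRUE) a treated row can appear in several pairs when its nearest controls are tied. The variable names below are illustrative only.

```r
library("Matching")
data("lalonde")

Y  <- lalonde$re78
Tr <- lalonde$treat
glm1 <- glm(Tr ~ age + educ + black + hisp + married + nodegr + re74 + re75,
            family = binomial, data = lalonde)
rr1 <- Match(Y = Y, Tr = Tr, X = glm1$fitted)

# Number of matched pairs (rows of the matched dataset).
n_pairs <- length(rr1$weights)

# Number of distinct original observations that appear in any pair.
n_unique <- length(unique(c(rr1$index.treated, rr1$index.control)))

# How often each control and each treated row is used across pairs.
control_reuse <- table(rr1$index.control)
treated_reuse <- table(rr1$index.treated)

c(pairs = n_pairs, unique_rows = n_unique)
summary(as.vector(control_reuse))  # values above 1 indicate reused controls
summary(as.vector(treated_reuse))  # values above 1 indicate tied matches
```

The two counts measure different things: n_pairs counts pairs, while n_unique counts distinct rows of the original data, so there is no reason to expect them to agree.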

(Sent by email in reply to César Landín's post of Tue, Jun 4, 2024, 9:12 PM, which is quoted in full above.)