markmfredrickson / optmatch

Functions for optimal matching in R
https://markmfredrickson.github.io/optmatch

Warning in exactMatch #122

Closed jwbowers closed 7 years ago

jwbowers commented 7 years ago

In one particular analysis I'm running into this warning (from makedist), and I haven't figured it out. Any ideas? (Code to replicate below.)

Code

library(optmatch)
load(url("http://jakebowers.org/Data/tmp.rda"))
mhDist <- match_on(mhfmla,within = exactMatch(a24~region+genderMale,data=psdat), data=psdat,method="rank_mahalanobis")

Output

R version 3.3.2 (2016-10-31) -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin13.4.0 (64-bit)

> library(optmatch)
Loading required package: survival
The optmatch package has an academic license. Enter relaxinfo() for more information.
> load(url("http://jakebowers.org/Data/tmp.rda"))
> mhDist <- match_on(mhfmla,within = exactMatch(a24~region+genderMale,data=psdat), data=psdat,method="rank_mahalanobis")
Warning message:
In replace(within, 1:length(within), dists) :
  number of items to replace is not a multiple of replacement length
> 

benthestatistician commented 7 years ago

I think this warning may be flagging a bug. It's being thrown at this line of makedist.R, i.e.

  res <- replace(within, 1:length(within), dists)

In the provided example, I get

Browse[1]> length(within)
[1] 4889857
Browse[1]> length(dists)
[1] 22887076
Browse[1]> length(dists)/length(within)
[1] 4.680521

so something's gone wrong. Barring an explanation as to why this should be innocuous, I'm flagging this as a bug.
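
The length mismatch can be reproduced in miniature with base R's `replace()`, which assigns `values` into the indexed positions of `x`. The sketch below uses made-up toy vectors (`within_toy` and `dists_toy` are hypothetical stand-ins, not objects from optmatch) to suggest why the warning is unlikely to be innocuous: when the replacement is longer than the target, the surplus values are silently dropped.

```r
# Toy stand-ins for the objects inspected in the browser session above;
# the names and values are illustrative assumptions, not optmatch internals.
within_toy <- c(0, 0, 0)               # 3 slots to fill
dists_toy  <- c(1, 2, 3, 4, 5, 6, 7)   # 7 replacement values; 7/3 is not an integer

# Capture the warning so the script keeps running and the message can be inspected.
warn_msg <- NULL
res <- withCallingHandlers(
  replace(within_toy, 1:length(within_toy), dists_toy),
  warning = function(w) {
    warn_msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(warn_msg)
print(res)  # only the first length(within_toy) values of dists_toy are used
```

In the reported example the same thing would happen at scale: only the first 4889857 of the 22887076 distances could be placed, so the result cannot be trusted.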

Additional comments:

  1. The example is big and takes a minute or two to run. (Nice to have a provided example, though! Thanks, @jwbowers.)
  2. The exactMatch() part of the example doesn't itself generate warnings.
  3. The last commit to touch this was a merge by @josherrickson, but I have the feeling that makedist was mostly in Mark's court. Could you both take a look and share your thoughts about what's supposed to be going on versus what actually is?
  4. I've heard of this warning occurring elsewhere in the wild; let's figure it out.

benthestatistician commented 7 years ago

Update: the problem is specific to the rank_mahalanobis method, which appears to be generating distances of incorrect length.

> em1 <- exactMatch(a24~region+genderMale,data=psdat)
> mhDist <- match_on(mhfmla, within=em1, data=psdat, method="rank_mahalanobis")
Warning message:
In replace(within, 1:length(within), dists) :
number of items to replace is not a multiple of replacement length
> mhDist <- match_on(mhfmla, within=em1, data=psdat, method="mahalanobis")
> 

benthestatistician commented 7 years ago

The resolution appears to be that this particular warning should have been an error. Hopefully it will stop being an error soon; see #128. Closing.
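
One way to make such a mismatch fail loudly, in the spirit of the comment above, is an explicit length check before the replacement. This is only a sketch under assumptions: `safe_replace_all` is a hypothetical helper, not part of optmatch, and it does not reflect how the package actually addressed the problem.

```r
# Hypothetical helper (not an optmatch function): refuse to fill `x`
# unless `values` has exactly one entry per element of `x`.
safe_replace_all <- function(x, values) {
  if (length(values) != length(x)) {
    stop("replacement has length ", length(values),
         " but target has length ", length(x))
  }
  replace(x, seq_along(x), values)
}

# Matching lengths behave like a plain replace().
ok <- safe_replace_all(c(0, 0, 0), c(1, 2, 3))

# A mismatch now raises an error instead of a warning.
err <- tryCatch(
  safe_replace_all(c(0, 0, 0), 1:7),
  error = function(e) conditionMessage(e)
)
print(err)
```

While debugging, `options(warn = 2)` is another option: it asks R to treat every warning as an error for the session, which can help locate the call that produced it.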