Closed benthestatistician closed 5 years ago
Seeing the same test failure on a slightly older version of R + different platform, viz.
!> sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 3.4.3 (2017-11-30)
os Red Hat Enterprise Linux Workstation 7.6 (Maipo)
system x86_64, linux-gnu
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2019-05-28
along w/ Rcpp version 1.0.1 .
I can't replicate this.
> summary(as.numeric(match_on(Z ~ X1 + X2 + B, method = "euclidean")))
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.5724 2.1063 3.2750 3.3909 4.3627 7.0814
> summary(as.numeric(euclid))
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.5724 2.1063 3.2750 3.3909 4.3627 7.0814
> summary(as.numeric(match_on(Z ~ X1 + X2 + B, method = "euclidean") - euclid))
Min. 1st Qu. Median Mean 3rd Qu. Max.
-4.912e-07 -1.306e-08 3.954e-08 1.286e-07 2.650e-07 7.197e-07
This is all of version on CRAN, on master, and on issue54-hinting branch. R 3.6.0.
Odd. Something unusual about my setup? I see it on my desktop Mac, sessioninfo::session_info() ─ Session info ─────────────────────────────────────────────────────────────── setting value version R version 3.6.0 (2019-04-26) os macOS Mojave 10.14.5 system x86_64, darwin15.6.0 ui X11 language (EN) collate en_US.UTF-8 ctype en_US.UTF-8 tz America/Detroit date 2019-06-05
and also on my lunix box:
sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 3.4.3 (2017-11-30)
os Red Hat Enterprise Linux Workstation 7.6 (Maipo)
system x86_64, linux-gnu
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2019-06-05
Do you have a package overwriting dist
? Do toy examples work?
d <- data.frame(z = c(0,1), x = rnorm(2))
library(optmatch)
all.equal(as.numeric(match_on(z ~ x, data = d, method = "euclidean")), as.numeric(dist(d$x)))
No and yes:
getAnywhere('dist')$where
[1] "package:stats" "namespace:stats"
d <- data.frame(z = c(0,1), x = rnorm(2))
library(optmatch)
all.equal(as.numeric(match_on(z ~ x, data = d, method = "euclidean")), as.numeric(dist(d$x)))
## [1] TRUE
I can confirm that I get the expected results on peirce.
Josh made the suggestion that I try this after starting w/ R --vanilla
. Findings:
--vanilla
. --vanilla
option, sessioninfo::session_info()
returns precisely the same things, on either machine. Mac version:─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 3.6.0 (2019-04-26)
os macOS Mojave 10.14.5
system x86_64, darwin15.6.0
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/Detroit
date 2019-06-12
─ Packages ───────────────────────────────────────────────────────────────────
package * version date lib source
abind 1.4-5 2016-07-21 [1] CRAN (R 3.6.0)
assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.0)
crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
digest 0.6.18 2018-10-10 [1] CRAN (R 3.6.0)
lattice 0.20-38 2018-11-04 [1] CRAN (R 3.6.0)
magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
Matrix 1.2-17 2019-03-22 [1] CRAN (R 3.6.0)
optmatch * 0.9-10 2018-07-12 [1] CRAN (R 3.6.0)
R6 2.4.0 2019-02-14 [1] CRAN (R 3.6.0)
Rcpp 1.0.1 2019-03-17 [1] CRAN (R 3.6.0)
RItools 0.1-16 2018-06-19 [1] CRAN (R 3.6.0)
rlang 0.3.4 2019-04-07 [1] CRAN (R 3.6.0)
sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
SparseM 1.77 2017-04-23 [1] CRAN (R 3.6.0)
survival * 2.44-1.1 2019-04-01 [1] CRAN (R 3.6.0)
svd 0.4.3 2019-04-05 [1] CRAN (R 3.6.0)
testthat * 2.1.1 2019-04-23 [1] CRAN (R 3.6.0)
withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.0)
xtable 1.8-4 2019-04-21 [1] CRAN (R 3.6.0)
[1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
(I'm not sure what this clue is telling us, but it seems a very good one. Thanks for the tip Josh!)
The next step I think would be to look into your .Rprofile.
not much of interest in my .Rprofile, as it happens:
## Print this on start so I know it's loaded.
cat("Loading $HOME/.Rprofile ...\n")
.First <- function() {
options(
repos = c(CRAN = "https://cran.case.edu/"),
digits=3L,
deparse.max.lines = 2L)
}
...and that's it.
Do you save your environment and reload it when starting R? Perhaps move to a directory where you've never launched R from and start it, see if the problem persists.
It looks like --vanilla sets 4 things: --no-site-file, --no-init-file, --no-environ and --no-restore. Try each of those and see which one(s) fix it to narrow down the issue.
The problem comes from my init file, specifically from the part of my
.First()
def setting the global "digits
" option to "3L
":
.First <- function() {
options(
repos = c(CRAN = "https://cran.case.edu/"),
digits=3L,
deparse.max.lines = 2L)
}
Surprisingly (to me), the digits option affects the calculations of
either of stats::dist()
and optmatch::match_on()
:
> all.equal(euclid_digits_set_to_3L, euclid_digits_not_set)
[1] "Mean relative difference: 0.676"
> all.equal(matchon_digits_set_to_3L, matchon_digits_not_set)
[1] "Mean relative difference: 0.518"
(When I put that line in my .Rprofile, I figured printing was the only thing that would be affected.)
No appetite for pinning it down any further, so I'll close out the issue after adjusting the tests a bit to avoid tripping this during development. (Unless you all complain, in which case I'll consider adjusting my .Rprofile instead.)
Thanks for another very nice troubleshooting tip, Josh!
... either that, or there's something wonky about my local setup, perhaps associated w/ recent upgrade to R 3.6.0. Running tests.match_on.R I see:
Investigating a little gives:
(A second test failure at test.match_on.R:136 is more or less the same thing.)
No indication of this failure in CRAN's checks. Locally I have a pair of commits ( f05a004 , c5efb27 ) relaxing the tolerance so that the tests pass, but I'll hold off a bit on publishing them to the master branch here on GH, in case I determine that the problem was specific to me.