r-quantities / units

Measurement units for R
https://r-quantities.github.io/units

weighted.mean() yields incorrect result when units are effectively [1] #363

Closed dholstius closed 8 months ago

dholstius commented 8 months ago

Summary

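When the units of x reduce to a dimensionless ratio (effectively [1]), as with day/week or lb/ton, stats::weighted.mean() returns the wrong answer. The number it returns appears to be the mean expressed as a unitless ratio, yet it still carries the original unit label.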

Reprex

``` r
library(units)
#> Warning: package 'units' was built under R version 4.2.3
#> udunits database from /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/units/share/udunits/udunits2.xml

w <- c(1, 1) # keep it simple

wtd_mean <- function (x, w, ...) {
  mu <- sum(x * w, ...) / sum(w, ...)
  set_units(mu, units(x), mode = "character")
}

# just to illustrate that `wtd_mean()` does the right thing
# these are all correct
x <- set_units(c(2, 4), "lb/week")
mean(x)
#> 3 [lb/week]
wtd_mean(x, w)
#> 3 [lb/week]
weighted.mean(x, w)
#> 3 [lb/week]

# here, weighted.mean() gets the wrong answer
# note: 0.429 [unitless] would be equivalent to 3/7 [day/week]
x <- set_units(c(2, 4), "day/week")
mean(x)
#> 3 [d/week]
wtd_mean(x, w)
#> 3 [d/week]
weighted.mean(x, w)
#> 0.4285714 [d/week]

# it's not only units of time
x <- set_units(c(2, 4), "lb/ton")
mean(x)
#> 3 [lb/ton]
wtd_mean(x, w)
#> 3 [lb/ton]
weighted.mean(x, w)
#> 0.0015 [lb/ton]
```

Created on 2024-01-10 with reprex v2.0.2
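
The two wrong values in the reprex match what you get by rescaling the correct mean of 3 into a unitless ratio. The base-R sketch below checks that arithmetic. It assumes 7 days per week and 2000 lb per ton (a short ton); the variable names are illustrative only.

``` r
# Correct mean of c(2, 4) with equal weights
mean_value <- 3

# Assumed conversion factors to a unitless ratio
day_per_week <- 1 / 7     # 1 day  = 1/7 week
lb_per_ton   <- 1 / 2000  # 1 lb   = 1/2000 short ton

as_ratio_time <- mean_value * day_per_week
as_ratio_mass <- mean_value * lb_per_ton

print(as_ratio_time)  # compare with the 0.4285714 [d/week] reported above
print(as_ratio_mass)  # compare with the 0.0015 [lb/ton] reported above
```

If these match, it suggests the value is being simplified to [1] somewhere while the original unit label is reattached afterwards. That is an inference from the numbers, not a confirmed diagnosis.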

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.0 (2022-04-22)
#>  os       macOS 14.2
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Los_Angeles
#>  date     2024-01-10
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  cli           3.6.2   2023-12-11 [1] CRAN (R 4.2.3)
#>  digest        0.6.33  2023-07-07 [1] CRAN (R 4.2.0)
#>  evaluate      0.23    2023-11-01 [1] CRAN (R 4.2.0)
#>  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.2.3)
#>  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.2.0)
#>  fs            1.6.3   2023-07-20 [1] CRAN (R 4.2.0)
#>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
#>  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.2.0)
#>  knitr         1.45    2023-10-30 [1] CRAN (R 4.2.0)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.2.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
#>  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.2.0)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
#>  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.2.0)
#>  R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.2.0)
#>  R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils       2.12.0  2022-06-28 [1] CRAN (R 4.2.0)
#>  Rcpp          1.0.11  2023-07-06 [1] CRAN (R 4.2.0)
#>  reprex        2.0.2   2022-08-17 [1] CRAN (R 4.2.0)
#>  rlang         1.1.2   2023-11-04 [1] CRAN (R 4.2.0)
#>  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.2.0)
#>  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.2.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
#>  styler        1.7.0   2022-03-13 [1] CRAN (R 4.2.0)
#>  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.2.0)
#>  units       * 0.8-5   2023-11-28 [1] CRAN (R 4.2.3)
#>  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.2.0)
#>  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.2.3)
#>  withr         2.5.2   2023-10-30 [1] CRAN (R 4.2.0)
#>  xfun          0.41    2023-11-01 [1] CRAN (R 4.2.0)
#>  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.2.3)
#>
#>  [1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```
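
Until this is fixed, one possible workaround is to take the weighted mean of the bare numbers and reattach the units afterwards. This is only a sketch. `weighted_mean_keep_units()` is a hypothetical helper, not part of the units package, and it assumes `w` is a plain numeric vector.

``` r
library(units)

# Hypothetical helper (not part of the units package): compute the weighted
# mean on the bare numbers, then reattach the original units unchanged.
weighted_mean_keep_units <- function(x, w, ...) {
  stopifnot(inherits(x, "units"))
  mu <- stats::weighted.mean(drop_units(x), w, ...)
  units(mu) <- units(x)  # labels a plain numeric; no unit conversion happens
  mu
}

x <- set_units(c(2, 4), "day/week")
res <- weighted_mean_keep_units(x, w = c(1, 1))

# With equal weights the result should agree with mean(x)
all.equal(drop_units(res), drop_units(mean(x)))
```

Because the arithmetic runs on plain numerics, nothing in it can simplify day/week or lb/ton to [1]. The cost is that weights with units of their own are not supported.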