bcgov / rcaaqs

An R package to facilitate the calculation of air quality metrics according to Canadian Ambient Air Quality Standards
Apache License 2.0

Inconsistent rounding Issue #66

Open jeromerobles opened 6 months ago

jeromerobles commented 6 months ago

Rounded results from rcaaqs::pm_24h_caaqs() are inconsistent depending on the date range of the data set. In the example below, when the data provided covers 2011-2015, the metric_value for 2015 is 28, but when the same data is filtered to 2013-2015, the metric_value for 2015 is 29. Note that the annual values for each year are the same in both cases.

# rounding check
library(dplyr)
library(lubridate)

# create pseudo data
data <- tibble(
  year = c(2011, 2012, 2013, 2014, 2015),
  value = c(27.8, 27.9, 32.6, 28.5, 24.4)
)
start_date <- ymd_hm('2011-01-01 00:00')
end_date <- ymd_hm('2015-12-31 23:00')

dates <- seq(from = start_date, to = end_date, by = 'hour')
df <- tibble(date_time = dates) %>%
  cross_join(tibble(site = 'site1')) %>%
  mutate(year = year(date_time)) %>%
  left_join(data)

df1 <- df
df2 <- df %>% filter(year >= 2013)

test1 <- rcaaqs::pm_24h_caaqs(data = df1)
test2 <- rcaaqs::pm_24h_caaqs(data = df2)
test1$caaqs
test2$caaqs

> test1$caaqs
# A tibble: 5 × 10
  caaqs_year min_year max_year n_years metric    metric_value caaqs             flag_daily_incomplete flag_yearly_incomplete flag_two_of_three_years
1       2011     2011     2011       1 pm2.5_24h           NA Insufficient Data NA                    FALSE                  FALSE
2       2012     2011     2012       2 pm2.5_24h           28 Not Achieved      NA                    FALSE                  TRUE
3       2013     2011     2013       3 pm2.5_24h           29 Not Achieved      NA                    FALSE                  FALSE
4       2014     2012     2014       3 pm2.5_24h           30 Not Achieved      NA                    FALSE                  FALSE
5       2015     2013     2015       3 pm2.5_24h           28 Not Achieved      NA                    FALSE                  FALSE

> test2$caaqs
# A tibble: 3 × 10
  caaqs_year min_year max_year n_years metric    metric_value caaqs             flag_daily_incomplete flag_yearly_incomplete flag_two_of_three_years
1       2013     2013     2013       1 pm2.5_24h           NA Insufficient Data NA                    FALSE                  FALSE
2       2014     2013     2014       2 pm2.5_24h           31 Not Achieved      NA                    FALSE                  TRUE
3       2015     2013     2015       3 pm2.5_24h           29 Not Achieved      NA                    FALSE                  FALSE

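One possible reading of the output above, offered as a hypothesis and not a confirmed diagnosis: the annual values for 2013-2015 in the pseudo data (32.6, 28.5, 24.4) sum to 85.5, so their three-year mean is 28.5, which is an exact rounding tie. A tie can land on 28 or 29 depending on the rounding rule applied and on any floating-point error in the intermediate mean. The sketch below only illustrates that sensitivity. `round_half_up` is a hypothetical helper written for this example and is not taken from rcaaqs, and nothing here shows which rule rcaaqs actually uses.

```r
# Annual values from the pseudo data for 2013-2015
annual <- c(`2013` = 32.6, `2014` = 28.5, `2015` = 24.4)

# Hypothetical helper (not from rcaaqs): round ties away from zero
round_half_up <- function(x) sign(x) * floor(abs(x) + 0.5)

# 28.5 is exactly representable as a double, so it is a true tie
tie <- 28.5
round(tie)          # 28: base R rounds exact ties to the even digit
round_half_up(tie)  # 29

# The computed mean may differ from 28.5 by floating-point error,
# so inspect it at full precision instead of assuming it is a tie
three_year_mean <- mean(annual)
print(three_year_mean, digits = 17)
```

If the mean computed on the longer series and the mean computed on the filtered series differ in the last bits, the same rounding call could plausibly return different integers, which would match the 28 versus 29 reported here.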
stephhazlitt commented 6 months ago

@jeromerobles I cannot seem to replicate the issue.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#> 
#>     date, intersect, setdiff, union

data <- tibble(
  year = c(2011, 2012, 2013, 2014, 2015),
  value = c(27.8, 27.9, 32.6, 28.5, 24.4)
)

start_date <- ymd_hm('2011-01-01 00:00')
end_date <- ymd_hm('2015-12-31 23:00')

dates <- seq(from = start_date, to = end_date, by = 'hour')

df <- tibble(date_time = dates) |>
  cross_join(tibble(site = 'site1')) |>
  mutate(year = year(date_time)) |>
  left_join(data)
#> Joining with `by = join_by(year)`

df1 <- df
df2 <- df |>
  filter(year >= 2013)

test1 <- rcaaqs::pm_24h_caaqs(data = df1)
#> Calculating PM 2.5 daily average
#> Calculating PM 2.5 annual 98th percentile
#> Calculating PM 2.5 24h CAAQS metric
test2 <- rcaaqs::pm_24h_caaqs(data = df2)
#> Calculating PM 2.5 daily average
#> Calculating PM 2.5 annual 98th percentile
#> Calculating PM 2.5 24h CAAQS metric

testdf1 <- test1[["caaqs"]]
testdf2 <- test2[["caaqs"]]

estimate1 <-
  testdf1 |> filter(caaqs_year == 2015) |> pull(metric_value)
estimate2 <-
  testdf2 |> filter(caaqs_year == 2015) |> pull(metric_value)

setequal(estimate1, estimate2)
#> [1] TRUE

Are we using different versions of rcaaqs or anything else under the hood?

> session_info()
─ Session info ──────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       macOS Ventura 13.6.1
 system   aarch64, darwin20
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Vancouver
 date     2024-03-01
 rstudio  2023.12.1+402 Ocean Storm (desktop)
 pandoc   3.1.11.1 @ /opt/homebrew/bin/pandoc

─ Packages ──────────────────────────────────────
 package     * version    date (UTC) lib source
 brio          1.1.4      2023-12-10 [1] CRAN (R 4.3.1)
 cachem        1.0.8      2023-05-01 [1] CRAN (R 4.3.0)
 cli           3.6.2      2023-12-11 [1] CRAN (R 4.3.1)
 colorspace    2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
 devtools    * 2.4.5      2022-10-11 [1] CRAN (R 4.3.0)
 digest        0.6.34     2024-01-11 [1] CRAN (R 4.3.1)
 dplyr       * 1.1.4      2023-11-17 [1] CRAN (R 4.3.1)
 ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.3.0)
 fansi         1.0.6      2023-12-08 [1] CRAN (R 4.3.1)
 fastmap       1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
 fs            1.6.3      2023-07-20 [1] CRAN (R 4.3.0)
 generics      0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
 ggplot2       3.5.0      2024-02-23 [1] CRAN (R 4.3.1)
 glue          1.7.0      2024-01-09 [1] CRAN (R 4.3.1)
 gtable        0.3.4      2023-08-21 [1] CRAN (R 4.3.0)
 htmltools     0.5.7      2023-11-03 [1] CRAN (R 4.3.1)
 htmlwidgets   1.6.4      2023-12-06 [1] CRAN (R 4.3.1)
 httpuv        1.6.14     2024-01-26 [1] CRAN (R 4.3.1)
 later         1.3.2      2023-12-06 [1] CRAN (R 4.3.1)
 lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
 lubridate   * 1.9.3      2023-09-27 [1] CRAN (R 4.3.1)
 magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
 memoise       2.0.1      2021-11-26 [1] CRAN (R 4.3.0)
 mime          0.12       2021-09-28 [1] CRAN (R 4.3.0)
 miniUI        0.1.1.1    2018-05-18 [1] CRAN (R 4.3.0)
 munsell       0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
 pillar        1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
 pkgbuild      1.4.3      2023-12-10 [1] CRAN (R 4.3.1)
 pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
 pkgload       1.3.4      2024-01-16 [1] CRAN (R 4.3.1)
 praise        1.0.0      2015-08-11 [1] CRAN (R 4.3.0)
 profvis       0.3.8      2023-05-02 [1] CRAN (R 4.3.0)
 promises      1.2.1      2023-08-10 [1] CRAN (R 4.3.0)
 purrr         1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
 R6            2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
 rcaaqs        0.3.1.9000 2023-10-19 [1] Github (bcgov/rcaaqs@fdfd30a)
 Rcpp          1.0.12     2024-01-09 [1] CRAN (R 4.3.1)
 remotes       2.4.2.1    2023-07-18 [1] CRAN (R 4.3.0)
 rlang         1.1.3      2024-01-10 [1] CRAN (R 4.3.1)
 rstudioapi    0.15.0     2023-07-07 [1] CRAN (R 4.3.0)
 scales        1.3.0      2023-11-28 [1] CRAN (R 4.3.1)
 sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
 shiny         1.8.0      2023-11-17 [1] CRAN (R 4.3.1)
 stringi       1.8.3      2023-12-11 [1] CRAN (R 4.3.1)
 stringr       1.5.1      2023-11-14 [1] CRAN (R 4.3.1)
 testthat    * 3.2.1      2023-12-02 [1] CRAN (R 4.3.1)
 tibble        3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
 tidyr         1.3.1      2024-01-24 [1] CRAN (R 4.3.1)
 tidyselect    1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
 timechange    0.3.0      2024-01-18 [1] CRAN (R 4.3.1)
 urlchecker    1.0.1      2021-11-30 [1] CRAN (R 4.3.0)
 usethis     * 2.2.3      2024-02-19 [1] CRAN (R 4.3.1)
 utf8          1.2.4      2023-10-22 [1] CRAN (R 4.3.1)
 vctrs         0.6.5      2023-12-01 [1] CRAN (R 4.3.1)
 withr         3.0.0      2024-01-16 [1] CRAN (R 4.3.1)
 xtable        1.8-4      2019-04-21 [1] CRAN (R 4.3.0)
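
To compare installs when checking for a version difference, a short sketch like the following reports the installed version of a package and, when available, the commit it was installed from. `describe_pkg` is a hypothetical helper written for this example. It assumes the `RemoteSha` field that remotes/devtools record in the installed DESCRIPTION for GitHub installs, and falls back to `NA` when that field is absent (for example, for CRAN or base packages).

```r
# Hypothetical helper: installed version plus source commit, if recorded
describe_pkg <- function(pkg) {
  desc <- utils::packageDescription(pkg)
  sha <- desc$RemoteSha  # NULL unless installed from a remote such as GitHub
  list(
    package = pkg,
    version = as.character(utils::packageVersion(pkg)),
    remote_sha = if (is.null(sha)) NA_character_ else sha
  )
}

# Only query rcaaqs if it is installed, so the sketch runs anywhere
if (requireNamespace("rcaaqs", quietly = TRUE)) {
  str(describe_pkg("rcaaqs"))
}
```

Posting this output from both machines makes it easy to see whether the two runs used the same rcaaqs commit.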