ThomUK / SPCreporter

Creates Metric Reports using Statistical Process Control in the NHS style
https://thomuk.github.io/SPCreporter/
Other
6 stars 2 forks source link

as.integer(.data$time_between) calculates time between dttms as seconds not days #96

Closed francisbarton closed 6 months ago

francisbarton commented 9 months ago

Reprex:

library(lubridate, warn.conflicts = FALSE)

test_data <- list(
  month = tibble::tribble(
    ~ref, ~measure_name, ~comment, ~`2023-12-01`,
    "100", "More data", "Test", 1
  ),
  events = tibble::tribble(
    ~ref, ~measure_name, ~comment, ~event_date_or_datetime,
    "1", "Test data", "Reprex for duplicate date", as_date("2023-12-18"),
    "1", "Test data", "Reprex for duplicate date", as_date("2023-12-18")
  )
)

data_cutoff_dttm <- as_datetime("2023-12-31 23:59:59")

e_data <- test_data |>
  purrr::pluck("events")

e_data_time_between <- SPCreporter:::process_event_data_t(e_data, data_cutoff_dttm)

e_data_time_between
#> # A tibble: 2 × 6
#>   aggregation ref   measure_name comment date         value
#>   <chr>       <chr> <chr>        <lgl>   <date>       <int>
#> 1 none        1     Test data    NA      2023-12-18       0
#> 2 none        1     Test data    NA      2023-12-31 1209599

I am not sure if this is desired behaviour that is then handled downstream, but I think the desired behaviour is to calculate a number of days between events. Currently it appears to calculate the lag in seconds.

Also, the initial event is removed from the data (there are two events on 2023-12-18 in the measure data).

ThomUK commented 6 months ago

I think we need to use seconds or decimal days internally to the package. If we have the time data I'd prefer to use it internally in the calcs, and do the rounding at the final presentation step (if rounding is needed at all).