Closed Nic-Chr closed 7 months ago
Below I include a benchmark of the above function compared to phsmethods::extract_fin_year()
as well as tests to ensure they produce the same financial years.
library(tidyverse)
library(lubridate)
library(bench)
extract_fin_year <- function(date){
if (!inherits(date, c("Date", "POSIXt"))) {
cli::cli_abort("{.arg date} must be a {.cls Date} or {.cls POSIXt} vector,\n not a {.cls {class(date)}} vector.")
}
y <- lubridate::year(date)
m <- lubridate::month(date)
fy <- y - ( (m - 3) %/% 1 <= 0)
next_fy <- (fy + 1) %% 100
out <- sprintf("%.4d/%02d", fy, next_fy)
out[is.na(date)] <- NA_character_
out
}
# Single date
x <- dmy(01012023)
mark(extract_fin_year(x),
phsmethods::extract_fin_year(x))
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 extract_fin_year(x) 52µs 58µs 15339. 5.88KB 17.7
#> 2 phsmethods::extract_fin_year(x) 150µs 164µs 5663. 47.83KB 19.1
# Single date-time
x <- as_datetime(x)
mark(extract_fin_year(x),
phsmethods::extract_fin_year(x))
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 extract_fin_year(x) 42.3µs 46.3µs 19650. 0B 19.3
#> 2 phsmethods::extract_fin_year(x) 118.8µs 131.3µs 6957. 0B 21.4
# Lots of dates
x <- dmy(01012000) + days(0:10^6)
mark(extract_fin_year(x),
phsmethods::extract_fin_year(x),
check = FALSE)
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 2 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 extract_fin_year(x) 1.59s 1.59s 0.627 202MB 2.51
#> 2 phsmethods::extract_fin_year(x) 6.62s 6.62s 0.151 549MB 1.06
# Helper to conert phs::extract_fin_year to newer format
### This is ONLY FOR TESTING and not to be used in any production code.
phs_convert_fin_year <- function(x){
# All digits before slash
digits1 <- str_extract(x, regex("^[:digit:]*(?=\\/)"))
# All digits after slash
digits2 <- str_extract(x, regex("(?<=\\/)[:digit:]*"))
# Use YYYY/YY format
out <- sprintf("%.4d/%02d", as.integer(digits1), as.integer(digits2) %% 100L)
out[is.na(x)] <- NA_character_
out
}
# Generate random dates
x <- seq(dmy(01011900), today(), by = "day")
x <- sample(x, size = 10^5, replace = TRUE)
x[sample.int(length(x), 10^2, TRUE)] <- NA
# Here we're making sure the new function produces correct financial years.
identical(extract_fin_year(x),
phs_convert_fin_year(
phsmethods::extract_fin_year(x)
))
#> [1] TRUE
Created on 2023-11-16 with reprex v2.0.2
Thanks for spotting and reporting this issue @Nic-Chr
If you have time, would you be able to put this together into a PR? Other than the code you've quoted above I think we would need to add some more tests (clearly some cases are being missed from the current set of tests).
Give me a message if you have any questions about the PR or testthat.
If you won't have time, just let us know and (@Tina815 or) I will implement it at some point!
Thanks @Moohan, sounds good. I'll go ahead and put this into a PR with some additional tests.
Hello, the documentation for
extract_fin_year
states that the "format for financial year is YYYY/YY" though for certain years this format is not applied. Please see the below reprex.Created on 2023-11-16 with reprex v2.0.2