andrewallenbruce / provider

Public Healthcare Provider APIs :stethoscope:
https://andrewallenbruce.github.io/provider/

Benchmarks: `tidy` Parameter #23

Closed andrewallenbruce closed 7 months ago

andrewallenbruce commented 8 months ago

affiliations()

library(bench)
library(provider)
library(ggplot2)

res <- bench::mark(
  affiliations_tidy = affiliations(parent_ccn = 670055),
  affiliations_raw = affiliations(parent_ccn = 670055, tidy = FALSE, na.rm = FALSE),
  check = FALSE,
  iterations = 5)

res |> dplyr::select(expression:mem_alloc, n_itr)
#> # A tibble: 2 × 5
#>   expression             min   median `itr/sec` mem_alloc
#>   <bch:expr>        <bch:tm> <bch:tm>     <dbl> <bch:byt>
#> 1 affiliations_tidy    287ms    308ms      3.16    17.7MB
#> 2 affiliations_raw     190ms    255ms      3.88   115.6KB

res |> autoplot("ridge")

Created on 2023-10-18 with reprex v2.0.2

andrewallenbruce commented 8 months ago

Functions with an xxxx_years() helper

A single call makes two request/response round trips to the API:

  1. util_years(): check whether the requested year is available
  2. httr2: return the data

Mapping with purrr adds another(!) req/res pair to util_years() for each year in the iteration.
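To make the request count concrete, here is a minimal sketch that simulates the pattern with a counter. Everything in it is a hypothetical stand-in: `available_years()`, `fetch_year()`, and the year range are made up for illustration and are not the package's actual internals.

```r
# Hypothetical stand-ins; none of these are functions from the provider package.
n_requests <- 0L

# Simulates a year-lookup helper: one request/response pair per call.
# The year range is an assumption chosen only for the example.
available_years <- function() {
  n_requests <<- n_requests + 1L
  2013:2021
}

# Simulates a data function that validates the year, then fetches the data.
fetch_year <- function(year) {
  stopifnot(year %in% available_years())  # request 1: year check
  n_requests <<- n_requests + 1L          # request 2: the data itself
  data.frame(year = year)
}

years   <- available_years()          # 1 request to build the vector of years
results <- lapply(years, fetch_year)  # 2 requests per year

n_requests
#> [1] 19
```

With 9 years, that is 1 + (2 × 9) = 19 requests, 10 of which are year lookups that all return the same answer.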

Two identical calls made within 10 seconds of each other (to account for any HTTP caching), each returning a data.frame with 3,125 rows and 73 columns:

library(provider)
library(purrr)   # map_dfr()
library(tictoc)  # tic() / toc()

tic()
map_dfr(util_years(), ~utilization(year = .x, city = "Valdosta", state = "GA", type = "provider", tidy = FALSE))
toc()

#> 14.41 sec elapsed
#> 8.81 sec elapsed

5.6 second difference between the first and second call [~39% reduction]
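For reference, the arithmetic behind that figure, using only the two timings reported above (the variable names are just for illustration):

```r
# Elapsed times reported by toc() for the two identical calls
first_call  <- 14.41
second_call <- 8.81

diff_sec <- first_call - second_call   # absolute difference in seconds
pct      <- diff_sec / first_call * 100  # reduction relative to the first call

round(diff_sec, 2)
#> [1] 5.6
round(pct, 1)
#> [1] 38.9
```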