USAID-OHA-SI / mindthegap

Munges and returns estimates from UNAIDS data.
https://usaid-oha-si.github.io/mindthegap/

Enhance pull_unaids() function to return global estimates using range_speedread() #13

Closed tessam30 closed 2 years ago

tessam30 commented 2 years ago

Issue

It takes around 400 seconds to read in the full global HIV estimates (the "HIV Estimates - Integer" sheet). What if we use the range_speedread() function from googlesheets4 to speed up the read? Jenny Bryan explains the gist of this here. A single-run benchmark comparing the two approaches:

Unit: seconds
  expr       min        lq      mean    median        uq       max neval
 speed   7.99622   7.99622   7.99622   7.99622   7.99622   7.99622     1
  pull 397.76969 397.76969 397.76969 397.76969 397.76969 397.76969     1
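
To put the two timings in proportion, the ratio can be worked out directly. This is only a back-of-the-envelope calculation on the numbers reported above; because the benchmark ran each expression once (neval = 1), the result should be read as a rough indication rather than a stable measurement. The variable names are illustrative.

```r
# Timings in seconds, copied from the single-run benchmark above
speed_sec <- 7.99622    # range_speedread()
pull_sec  <- 397.76969  # read_sheet()

# How many times faster the speed read was in this one run
speedup <- pull_sec / speed_sec
round(speedup, 1)
#> [1] 49.7
```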

Actions suggested

Here is a code snippet if you want to test the speed of each function:

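library(googlesheets4)
library(mindthegap)
library(microbenchmark)

# Placeholder: replace with the actual Google Sheet ID
glbl_id <- "UNAIDS 2021 Clean Estimates [update to sheet id]"

# Fast path: range_speedread() fetches the sheet through its CSV export
speed_read <- function() {
  range_speedread(glbl_id, sheet = "HIV Estimates - Integer")
}

# Baseline: read_sheet() fetches the sheet through the Sheets API
pull_read <- function() {
  googlesheets4::read_sheet(glbl_id, sheet = "HIV Estimates - Integer")
}

# times = 1 evaluates each expression once, so treat the timings as rough
microbenchmark(
  speed = speed_read(),
  pull = pull_read(),
  times = 1
)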
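
One way the suggestion could be folded into a reader is sketched below. This is not how pull_unaids() is implemented; read_estimates() is a hypothetical helper, and the fallback to read_sheet() is an assumption added for illustration, on the idea that the fast path may fail in some setups. The two readers are passed in as arguments so the logic can be tried without touching a real sheet.

```r
# Hypothetical helper (not part of mindthegap): try the fast reader first,
# and fall back to the slower Sheets API reader if the fast one errors.
read_estimates <- function(ss, sheet,
                           fast = googlesheets4::range_speedread,
                           slow = googlesheets4::read_sheet) {
  tryCatch(
    fast(ss, sheet = sheet),
    error = function(e) {
      message("Fast read failed, falling back to the slow reader: ",
              conditionMessage(e))
      slow(ss, sheet = sheet)
    }
  )
}

# Offline check with stand-in readers (no Google Sheets access needed)
fake_fast_ok   <- function(ss, sheet) data.frame(source = "fast")
fake_fast_fail <- function(ss, sheet) stop("export unavailable")
fake_slow      <- function(ss, sheet) data.frame(source = "slow")

res_fast <- read_estimates("some-id", "HIV Estimates - Integer",
                           fast = fake_fast_ok, slow = fake_slow)
res_slow <- read_estimates("some-id", "HIV Estimates - Integer",
                           fast = fake_fast_fail, slow = fake_slow)
```

With the real defaults, the call would look like read_estimates(glbl_id, sheet = "HIV Estimates - Integer"). Note that range_speedread() parses the data with readr, so column types may be guessed differently than with read_sheet(); it would be worth comparing the two results before switching.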