Closed agila5 closed 11 months ago
See #205 and the discussion in https://github.com/tidyverse/readr/issues/1266
When testing the existing approach I get the following:
```r
# current approach
remotes::install_github("ropensci/stats19", "master", upgrade = "never", quiet = TRUE)
options(width = 120)

microbenchmark::microbenchmark(
  one_year = suppressMessages(stats19::get_stats19(2019, silent = TRUE)),
  one_year_filter = {
    suppressMessages(crashes <- stats19::get_stats19(2019, silent = TRUE))
    crashes_london <- crashes[crashes$police_force == "City of London", ]
  },
  multiple_years = suppressMessages(stats19::get_stats19(2015:2019, silent = TRUE)),
  multiple_years_filter = {
    suppressMessages(crashes <- stats19::get_stats19(2015:2019, silent = TRUE))
    crashes_london <- crashes[crashes$police_force == "City of London", ]
  },
  times = 5L
)
#> Unit: milliseconds
#>                   expr      min        lq      mean    median        uq       max neval cld
#>               one_year  728.484  758.3625  857.1061  886.6820  931.4195  980.5827     5  a
#>        one_year_filter  691.783  710.3164  742.4217  728.0939  779.0534  802.8620     5  a
#>         multiple_years 5721.792 5732.5708 5912.5875 5925.7921 6090.2348 6092.5478     5   b
#>  multiple_years_filter 5806.420 6018.1828 6389.2668 6188.9528 6598.0124 7334.7656     5   b
```
Created on 2021-08-24 by the reprex package (v2.0.0)
With just lazy = FALSE in all scenarios, I get the following:
```r
# lazy = FALSE approach
remotes::install_github("ropensci/stats19", "test-lazy", upgrade = "never", quiet = TRUE)
options(width = 120)

microbenchmark::microbenchmark(
  one_year = suppressMessages(stats19::get_stats19(2019, silent = TRUE)),
  one_year_filter = {
    suppressMessages(crashes <- stats19::get_stats19(2019, silent = TRUE))
    crashes_london <- crashes[crashes$police_force == "City of London", ]
  },
  multiple_years = suppressMessages(stats19::get_stats19(2015:2019, silent = TRUE)),
  multiple_years_filter = {
    suppressMessages(crashes <- stats19::get_stats19(2015:2019, silent = TRUE))
    crashes_london <- crashes[crashes$police_force == "City of London", ]
  },
  times = 5L
)
#> Unit: milliseconds
#>                   expr       min        lq      mean    median        uq       max neval cld
#>               one_year  702.4651  726.7535  751.8321  730.6509  732.2352  867.0559     5  a
#>        one_year_filter  673.2410  684.2855  778.9064  733.4660  855.8780  947.6614     5  a
#>         multiple_years 4476.7977 4489.1523 4614.8126 4517.1048 4519.3307 5071.6776     5   b
#>  multiple_years_filter 4746.5225 4991.9151 5301.0332 5073.7521 5099.0314 6593.9451     5   b
```
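To make the two tables easier to compare, here is a small sketch that divides the lazy = FALSE medians by the current-approach medians. The numbers are copied from the benchmark output above; the vector names `current` and `eager` are only illustrative, and with `times = 5L` the ratios should be read as rough indications rather than precise measurements.

```r
# Median timings in milliseconds, copied from the two benchmark tables above.
current <- c(
  one_year              = 886.6820,
  one_year_filter       = 728.0939,
  multiple_years        = 5925.7921,
  multiple_years_filter = 6188.9528
)
eager <- c(
  one_year              = 730.6509,
  one_year_filter       = 733.4660,
  multiple_years        = 4517.1048,
  multiple_years_filter = 5073.7521
)

# Ratio of lazy = FALSE to the current approach, per scenario.
ratio <- eager / current
round(ratio, 2)
```

A ratio below 1 means the lazy = FALSE run had the lower median for that scenario.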
I think we should test it a little bit more (the previous tests were run on an Ubuntu 18.04 VM) and then maybe just adopt lazy = FALSE.
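For reference, a minimal sketch of what reading with lazy = FALSE looks like at the readr level, assuming readr >= 2.0.0 (where `read_csv()` has a `lazy` argument). The temporary file and its two columns are a made-up stand-in for a STATS19 CSV, and whether this mirrors what `get_stats19()` does internally is an assumption.

```r
library(readr)

# Hypothetical stand-in for one of the STATS19 CSV files.
tmp <- tempfile(fileext = ".csv")
write_csv(
  data.frame(
    accident_index = c("a1", "a2", "a3"),
    police_force   = c("City of London", "Metropolitan Police", "City of London")
  ),
  tmp
)

# lazy = FALSE parses the whole file up front instead of deferring the work
# until the columns are first accessed.
crashes <- read_csv(tmp, lazy = FALSE, show_col_types = FALSE)

# Same filter as in the benchmarks above.
crashes_london <- crashes[crashes$police_force == "City of London", ]
nrow(crashes_london)

unlink(tmp)
```

Because the data are fully read before the filter runs, the subsetting step does not trigger any deferred parsing.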
:+1:
The results on my Windows laptop are more or less identical.