Closed juliasilge closed 9 months ago
Just to reassure me that this doesn't have a big performance impact, could you give a couple of benchmarks of reading/writing a pin before and after this change?
It's definitely slower. Here is with current CRAN:
bench::mark(
pins = {
library(pins)
board <- board_connect()
name <- pin_write(board, 1, pins:::random_pin_name())
testthat::expect_equal(pin_read(board, name), 1)
pin_delete(board, name)
},
iterations = 5
)
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-NfgrIuJ1Yf'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-aqX9EJ6vgk'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-Tobj1QG7JD'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-UsFthxGSFR'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-2DtQsuQjFY'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-OaKWLFT8vB'
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 pins 3.64s 3.8s 0.264 14.8MB 0.685
Created on 2023-10-04 with reprex v2.0.2
And here is with this PR:
bench::mark(
pins = {
library(pins)
board <- board_connect()
name <- pin_write(board, 1, pins:::random_pin_name())
testthat::expect_equal(pin_read(board, name), 1)
pin_delete(board, name)
},
iterations = 5
)
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-jJODh5a5wa'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-oe8m7ajoYw'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-sGcG0aUn9t'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-9pB1xPTZpm'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-RngIENxoHL'
#> Connecting to Posit Connect 2023.07.0 at <https://colorado.posit.co/rsc>
#> Guessing `type = 'rds'`
#> Writing to pin 'julia.silge/test-veDVBVQXWv'
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 × 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 pins 4.01s 4.05s 0.245 15MB 0.686
Created on 2023-10-04 with reprex v2.0.2
In this example, we look up what content on Connect belongs to name
three times (write, read, delete). With CRAN right now, we look that up one time and then after that look in the content cache, while with this PR we look it up every time and it's 5-10% slower. If we increased that a lot (say, write to the same pin a dozen times) the difference would increase more and more with each additional API call that we weren't doing before.
The argument for making the change in this PR would be that this gain in speed isn't worth it because it is difficult for us to safely accommodate when the local cache gets out of sync, and then it's super confusing to the user.
This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
In #667 we moved from caches saved to disk to environment caches, but this still causes problems that are very confusing and hard to reason about, as @ryjohnson09 demonstrates in #789. I would like to propose we just throw this whole idea away; my opinion is that these Connect user and content caches are not gaining us much (especially now that they are only environment caches) and when things go wrong, it's pretty horrible.