Open schuemie opened 1 year ago
I'm not sure whether CRAN allows that. I think it's generally not recommended to set options for the user, but we could check the option and print a message if it is not set.
An alternative to pull() would be to use select() and then collect():
library(Andromeda)
a <- andromeda(cars = cars)
a$cars %>% pull(speed)
#> [1] 4 4 7 7 8 9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
#> [26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25
a$cars %>% select(speed) %>% collect() %>% {.[["speed"]]}
#> [1] 4 4 7 7 8 9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
#> [26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25
Created on 2023-03-17 with reprex v2.0.2
This example is using the current release but it should work with the new release as well.
I commented on the issue: https://github.com/apache/arrow/issues/32705
I think I'd propose printing a message or warning in Andromeda's .onLoad() if the arrow.pull_as_vector option is not set.
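A rough sketch of what such a check could look like, using base R only. The helper name checkPullOption() and the message wording are made up for illustration and are not part of Andromeda. Note that startup messages are conventionally emitted from .onAttach() rather than .onLoad(), so the wiring shown here is just one possibility.

```r
# Hypothetical helper; checkPullOption() is an illustrative name, not an
# existing Andromeda function.
checkPullOption <- function() {
  isSet <- !is.null(getOption("arrow.pull_as_vector"))
  if (!isSet) {
    # packageStartupMessage() can be silenced by the user with
    # suppressPackageStartupMessages().
    packageStartupMessage(
      "Option 'arrow.pull_as_vector' is not set. ",
      "Consider options(arrow.pull_as_vector = TRUE) so pull() returns an R vector."
    )
  }
  invisible(isSet)
}

# Assumed wiring in the package's zzz.R (could equally be .onAttach()).
.onLoad <- function(libname, pkgname) {
  checkPullOption()
}
```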
Alternatively I could provide a function that would have the same behavior as pull (returns vector) and we could switch to that.
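Such a function could be little more than the select-then-collect pattern wrapped up. A minimal sketch, assuming a hypothetical name pullVector() and a column passed as a string (neither exists in Andromeda today):

```r
library(dplyr)

# Hypothetical helper; pullVector() is an illustrative name only.
pullVector <- function(table, column) {
  collected <- table %>%
    select(all_of(column)) %>%  # keep only the requested column
    collect()                   # materialize as a local data frame
  collected[[column]]           # extract the column as a plain R vector
}
```

With the Andromeda object from the reprex this would be called as pullVector(a$cars, "speed"). Because it never calls pull(), it should not depend on the arrow.pull_as_vector option at all.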
Do you think it would be possible or even advantageous to use chunked arrays instead of R vectors? One benefit we have in Andromeda is that
Another option is to add withr::local_options() at the beginning of functions that use pull() on Andromeda tables.
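Roughly like this, as a sketch. getSpeeds() is a made-up function for illustration, and I'm assuming that setting arrow.pull_as_vector to TRUE is what keeps the vector-returning behavior and silences the warning:

```r
library(dplyr)

# Hypothetical function for illustration; not part of any HADES package.
getSpeeds <- function(andromeda) {
  # Set the option for the duration of this call only; withr restores the
  # previous value when the function exits, so the user's session is untouched.
  withr::local_options(list(arrow.pull_as_vector = TRUE))
  andromeda$cars %>% pull(speed)
}
```

The upside is that nothing is changed globally for the user; the downside is that every function using pull() needs the extra line.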
When using pull(), we get this warning:
This warning is annoying, and the advertised new behavior will break many HADES packages when it becomes the default in some future release of arrow.
I don't have a good solution here. Is there an alternative to pull()? Should we set options(arrow.pull_as_vector) in Andromeda's onLoad() function? (Would CRAN allow that?)