dibbles21 closed this 1 year ago
Unfortunately this isn't something that can work with tidytable. In these cases I would just recommend using dplyr/dbplyr/arrow.
One option is to use unloadNamespace("dbplyr") to detach packages before using tidytable.
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
# Querying a fake database
db_table <- memdb_frame(x = 1:3, y = c("a", "a", "b"))
df <- db_table %>%
  select(x, y) %>%
  collect()
# Switch over to tidytable
unloadNamespace("dbplyr")
unloadNamespace("dplyr")
library(tidytable, warn.conflicts = FALSE)
df %>%
  mutate(double_x = x * 2)
#> # A tidytable: 3 × 3
#>       x y     double_x
#>   <int> <chr>    <dbl>
#> 1     1 a            2
#> 2     2 a            4
#> 3     3 b            6
Another option would be to use dtplyr, which allows you to continue using the piping workflow. Though dtplyr has less functionality than tidytable, it integrates much better with packages like dbplyr/arrow.
FYI I am a co-author of dtplyr, so I am working on getting more functions into dtplyr. I doubt it will ever have as many features as tidytable, but hopefully we can get it somewhat close.
Also if you do use dtplyr, I would recommend installing the development version from GitHub - we are in the process of releasing a new version to CRAN.
# Install the latest version
# devtools::install_github("tidyverse/dtplyr")
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
library(dtplyr)
db_table <- memdb_frame(x = 1:3, y = c("a", "a", "b"))
# Using dbplyr & dtplyr together
df <- db_table %>%
  select(x, y) %>%
  collect() %>%
  lazy_dt() %>%
  mutate(double_x = x * 2)
df
#> Source: local data table [3 x 3]
#> Call: copy(`_DT1`)[, `:=`(double_x = x * 2)]
#>
#>       x y     double_x
#>   <int> <chr>    <dbl>
#> 1     1 a            2
#> 2     2 a            4
#> 3     3 b            6
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
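As the last line of that output suggests, the lazy_dt() pipeline is not evaluated until you ask for the results. A minimal sketch of that final step, assuming dplyr and dtplyr are installed and using a small local data frame in place of the collected database table:

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Build a lazy data.table pipeline; nothing is computed yet
lazy <- lazy_dt(data.frame(x = 1:3, y = c("a", "a", "b"))) %>%
  mutate(double_x = x * 2)

# Force evaluation and get an ordinary tibble back
result <- as_tibble(lazy)
```

as.data.table() or as.data.frame() can be used the same way if you would rather have one of those classes.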
Hope this helps - if you have any questions let me know.
Thank you Mark for your response, it's great to know what's possible. For now I will continue using dplyr functions before collect. What I love about tidytable is that I don't have to change my legacy dplyr code, I just need to load tidytable after dplyr.
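A minimal sketch of that workflow, assuming dplyr, dbplyr, RSQLite and tidytable are all installed. The dplyr:: prefixes are only there to make sure the dplyr verbs run on the database table, since tidytable is loaded last and masks them:

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)
library(tidytable, warn.conflicts = FALSE)

# Fake in-memory database table
db_table <- memdb_frame(x = 1:3, y = c("a", "a", "b"))

# Use the dplyr versions explicitly while the data is still remote
df <- db_table %>%
  dplyr::select(x, y) %>%
  dplyr::collect()

# After collect() the data is local, so the tidytable verbs apply
out <- df %>%
  mutate(double_x = x * 2)
```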
Do you have a rough idea when the next dtplyr version will go to CRAN? It's company policy to only use CRAN packages and not dev versions 😁.
> Do you have a rough idea when the next dtplyr version will go to CRAN? It's company policy to only use CRAN packages and not dev versions
Sometime in the next week I would guess. It's been submitted, we're just waiting on CRAN approval.
Thank you
New CRAN version of dtplyr is out now 🥳
Thanks!!
Hi Mark,
Perhaps this isn't a tidytable issue, but I find that tidytable functions don't work when querying databases/arrow files if they are used before dplyr::collect(). I find that I have to explicitly use the dplyr versions in such cases. Is this something that could potentially work with tidytable?
Thanks,
Dan