Closed brancengregory closed 1 year ago
Related to #93 and #94
@andrewjbe
Info on how dbGetQuery
works under the hood: https://dbi.r-dbi.org/reference/dbsendquery#the-data-retrieval-flow-1
TLDR:
We currently get records back by doing one of two things:
A) calling dbGetQuery
on a db connection and SQL query string
or B) using collect
on a tbl
equivalent object
collect
is implemented here: https://github.com/tidyverse/dbplyr/blob/7b0ed3047c0bb24b29fef58134b19b845e194a45/R/db-io.R#L140
Under the hood, it is doing what dbGetQuery
does as well, which is call a combination of dbSendQuery
and dbFetch
.
We can make our own ojo_collect()
function that also wraps those functions, but adds additional features like some kind of progress indicator, time estimate, etc.
Relevant conversation in {RPostgres}
: https://github.com/r-dbi/RPostgres/issues/299
Quite tagential but related: https://github.com/rstudio/pool/issues/83#issuecomment-812602134
More context that highlights some under the hood {RPostgres}
code. While it would be ideal if our solution doesn't mess with these, since it would then require a rewrite with a different db backend, it is technically an option, and if the functionality were awesome enough, we could make that call:
Before I start trying to implement these suggestions, here's what I have so far on the data-pull-spinner
branch. This is what I'm kind of aiming for the end result to look like:
As you can see though, it's kinda slow for larger queries because it repeats work (i.e. using tally()
to count the rows and then using collect()
to actually download them), which we can fix using your suggestions. But the basic idea is to display an affirmation that results were found (including the number), and then display a cli message for every step in the process as its downloading, even though there's only really one "step" right now. I think the only way we could get an actual progress bar / spinner is to have our new version of ojo_collect()
download the results row by row or paginated somehow.
dbFetch
with RPostgres
allows us to paginate the response!
collect()
on tbl_sql's is implemented here: https://github.com/tidyverse/dbplyr/blob/7b0ed3047c0bb24b29fef58134b19b845e194a45/R/verb-compute.R#L111
It shows the right way to render the sql to be passed to dbSendQuery
and friends
{cli}
package