r-dbi / bigrquery

An interface to Google's BigQuery from R.
https://bigrquery.r-dbi.org

compute() function in dplyr is broken #596

Closed or-asher closed 10 months ago

or-asher commented 10 months ago

Hey, I heard there would be a new version of the package, so I started testing the new changes in our code. I use bigrquery to construct and execute BigQuery queries through dbplyr. I quickly noticed that compute() doesn't work correctly. The compute() call itself probably succeeds in saving the results to a temp table, as it doesn't throw an exception.

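library(bigrquery)
library(dplyr)

# Authenticate, then connect to BigQuery through DBI
bq_auth()
con <- DBI::dbConnect(
    bigquery(),
    project = 'project_name',
    dataset = 'dataset_name',
    billing = 'project_name'
  )
# Materialize the query result as a temporary table
a <- tbl(con, 'dataset_name.some_table') |> compute()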

However, once you try to access the temp table, it seems to have been given an incorrect table name:

> a
Error in `bq_get()`:
! Not found: Table project_name:dataset_name.dbplyr_005 [notFound]
Run `rlang::last_trace()` to see where the error occurred.
> rlang::last_trace()
<error/bigrquery_notFound>
Error in `bq_get()`:
! Not found: Table project_name:dataset_name.dbplyr_005 [notFound]
---
Backtrace:
     ▆
  1. ├─base (local) `<fn>`(x)
  2. └─dbplyr:::print.tbl_sql(x)
  3.   ├─dbplyr:::cat_line(format(x, ..., n = n, width = width, n_extra = n_extra))
  4.   │ ├─base::cat(paste0(..., "\n"), sep = "")
  5.   │ └─base::paste0(..., "\n")
  6.   ├─base::format(x, ..., n = n, width = width, n_extra = n_extra)
  7.   └─pillar:::format.tbl(x, ..., n = n, width = width, n_extra = n_extra)
  8.     └─pillar:::format_tbl(...)
  9.       └─pillar::tbl_format_setup(...)
 10.         ├─pillar:::tbl_format_setup_dispatch(...)
 11.         └─pillar:::tbl_format_setup.tbl(...)
 12.           └─pillar:::df_head(x, n + 1)
 13.             ├─base::as.data.frame(head(x, n))
 14.             └─dbplyr:::as.data.frame.tbl_sql(head(x, n))
 15.               ├─base::as.data.frame(collect(x, n = n))
 16.               ├─dplyr::collect(x, n = n)
 17.               └─bigrquery:::collect.tbl_BigQueryConnection(x, n = n)
 18.                 └─bigrquery::bq_table_download(...)
 19.                   ├─bigrquery:::set_row_params(...)
 20.                   └─bigrquery::bq_table_nrow(x)
 21.                     └─bigrquery::bq_table_meta(x, fields = "numRows")
 22.                       └─bigrquery:::bq_get(url, query = list(fields = fields))
Run rlang::last_trace(drop = FALSE) to see 5 hidden frames.

If you try to use this temp table in any further query, it will also fail, as the table doesn't actually exist.

hadley commented 10 months ago

I see this too:

library(bigrquery)

con <- DBI::dbConnect(bigquery(),
  project = bq_test_project(),
  dataset = "basedata"
)
bq_mtcars <- dplyr::tbl(con, "mtcars") |> dplyr::filter(cyl == 4)

temp <- dplyr::compute(bq_mtcars)
temp
#> Error in `bq_get()` at bigrquery/R/bq-table.R:79:3:
#> ! Not found: Table gargle-169921:basedata.dbplyr_001 [notFound]

Created on 2024-01-18 with reprex v2.0.2.9000

I have a test for compute() but it doesn't collect the data 😞
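
A test that also collects the computed table would exercise the code path that fails here. The following is only a minimal sketch, not the test that ships with the package: it assumes testthat is installed, that BigQuery test credentials are configured, and that the `basedata.mtcars` table from the reprex above exists. The test name and assertions are illustrative.

```r
library(testthat)
library(bigrquery)

test_that("compute() result can be collected", {
  # Skip when no BigQuery credentials or test project are configured
  skip_if_not(bq_testable())

  con <- DBI::dbConnect(
    bigquery(),
    project = bq_test_project(),
    dataset = "basedata" # assumed: same dataset as the reprex above
  )
  on.exit(DBI::dbDisconnect(con))

  bq_mtcars <- dplyr::tbl(con, "mtcars") |> dplyr::filter(cyl == 4)
  temp <- dplyr::compute(bq_mtcars)

  # Collecting forces a read from the computed table, which is the step
  # where the "Not found" error surfaced in this issue
  out <- dplyr::collect(temp)
  expect_s3_class(out, "data.frame")
  expect_true(all(out$cyl == 4))
})
```

Without credentials the test is skipped rather than failed, so it can sit in a suite that also runs offline.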

hadley commented 10 months ago

Thanks for discovering this! I was about to release bigrquery, so it's good to catch it before that!