Closed botan closed 3 hours ago
The example below requires the bigquery-public-data.usa_names.usa_1910_current
table to be copied to your project.
library(bigrquerystorage)
library(glue)
billing <- Sys.getenv("GCP_BILLING_PROJECT_ID")
fields <- c("name", "number", "state")
bigquery_storage_api_rows <-
  bqs_table_download(
    x = glue("{billing}.usa_names.usa_1910_current"),
    selected_fields = fields,
    row_restriction = 'state = "WA"'
  )
fields == colnames(bigquery_storage_api_rows)
#> [1] FALSE FALSE FALSE
colnames(bigquery_storage_api_rows)
#> [1] "state" "name" "number"
This is expected, since the package mimics the BigQuery Storage Read API (https://cloud.google.com/bigquery/docs/reference/storage/rpc/google.cloud.bigquery.storage.v1#tablereadoptions), whose documentation for selected_fields states:

"The order of the fields in the read session schema is derived from the table schema and does not correspond to the order in which the fields are specified in this list."

Ordering is instead done with a SELECT statement, or by the caller after downloading through this API.
One way to do that, continuing the example above:
library(bigrquerystorage)
library(glue)
billing <- Sys.getenv("GCP_BILLING_PROJECT_ID")
fields <- c("name", "number", "state")
bigquery_storage_api_rows <-
  bqs_table_download(
    x = glue("{billing}.usa_names.usa_1910_current"),
    selected_fields = fields,
    row_restriction = 'state = "WA"'
  )[fields]
fields == colnames(bigquery_storage_api_rows)
#> [1] TRUE TRUE TRUE
colnames(bigquery_storage_api_rows)
#> [1] "name" "number" "state"
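The same reordering idiom can be demonstrated with a plain data.frame, with no BigQuery dependency (the data below is made up for illustration):

```r
# A stand-in for the downloaded table, with columns in the
# table-schema order that the Storage Read API returns.
downloaded <- data.frame(
  state  = c("WA", "WA"),
  name   = c("Mary", "John"),
  number = c(12L, 7L)
)
fields <- c("name", "number", "state")

# Single-bracket subsetting with a character vector selects the
# columns in the order given, which reorders them.
reordered <- downloaded[fields]
colnames(reordered)
#> [1] "name"   "number" "state"
```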
That sounds reasonable, thanks!
It seems that the BigQuery Storage Read API doesn't respect the order of fields specified by the user and instead returns the fields in the original table-schema order. I was wondering if it would be a better user experience if bigrquerystorage reordered the columns according to the user-supplied selected_fields order.
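A minimal sketch of what that could look like inside the package; the helper function name and its placement are hypothetical, not part of bigrquerystorage:

```r
# Hypothetical helper: reorder downloaded columns to match the
# user-supplied selected_fields order. NULL or empty selected_fields
# means "all columns", so the table-schema order is kept as-is.
reorder_selected_fields <- function(df, selected_fields = NULL) {
  if (is.null(selected_fields) || length(selected_fields) == 0) {
    return(df)
  }
  df[selected_fields]
}

# Example with made-up data in table-schema order:
df <- data.frame(state = "WA", name = "Mary", number = 12L)
colnames(reorder_selected_fields(df, c("name", "number", "state")))
#> [1] "name"   "number" "state"
```

bqs_table_download() could apply a step like this just before returning, so the result always matches the selected_fields argument the user passed in.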