apache / arrow-adbc

Database connectivity API standard and libraries for Apache Arrow
https://arrow.apache.org/adbc/
Apache License 2.0

BigQuery adbc_ingest Does Not work #2172

Open WillAyd opened 1 week ago

WillAyd commented 1 week ago

What happened?

When trying to ingest data into BigQuery with adbc_ingest, I get the following error: ProgrammingError: INVALID_ARGUMENT: unknown statement string type option `adbc.ingest.target_table`

Stack Trace

Cell In[29], line 14
     11 tbl = pa.Table.from_pydict({"col": [0, 1, 2]})
     13 with adbc_driver_bigquery.dbapi.connect(db_kwargs) as conn, conn.cursor() as cur:
---> 14     cur.adbc_ingest(table_name="foo", data=tbl)

File ~/clones/arrow-adbc/python/adbc_driver_manager/adbc_driver_manager/dbapi.py:895, in Cursor.adbc_ingest(self, table_name, data, mode, catalog_name, db_schema_name, temporary)
    891 if db_schema_name is not None:
    892     options[
    893         adbc_driver_manager.StatementOptions.INGEST_TARGET_DB_SCHEMA.value
    894     ] = db_schema_name
--> 895 self._stmt.set_options(**options)
    897 if temporary:
    898     self._stmt.set_options(
    899         **{
    900             adbc_driver_manager.StatementOptions.INGEST_TEMPORARY.value: "true",
    901         }
    902     )

File ~/clones/arrow-adbc/python/adbc_driver_manager/adbc_driver_manager/_lib.pyx:1482, in adbc_driver_manager._lib.AdbcStatement.set_options()

File ~/clones/arrow-adbc/python/adbc_driver_manager/adbc_driver_manager/_lib.pyx:260, in adbc_driver_manager._lib.check_error()

ProgrammingError: INVALID_ARGUMENT: unknown statement string type option `adbc.ingest.target_table`
> /home/demo/code/adbc14/adbc_driver_manager/_lib.pyx(260)adbc_driver_manager._lib.check_error()

How can we reproduce the bug?

import adbc_driver_bigquery.dbapi
from adbc_driver_bigquery import DatabaseOptions
import pyarrow as pa

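# Connection options for the BigQuery driver (placeholder project and dataset names)
db_kwargs = {
    DatabaseOptions.PROJECT_ID.value: "some-demo-project-1234",
    DatabaseOptions.DATASET_ID.value: "demo_dataset",
    DatabaseOptions.TABLE_ID.value: "foo",
}

# Small single-column Arrow table to ingest
tbl = pa.Table.from_pydict({"col": [0, 1, 2]})

with adbc_driver_bigquery.dbapi.connect(db_kwargs) as conn, conn.cursor() as cur:
    # Raises ProgrammingError: the driver rejects the adbc.ingest.target_table option
    cur.adbc_ingest(table_name="foo", data=tbl)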

Environment/Setup

No response

joellubi commented 1 week ago

Hi @WillAyd. The BigQuery driver is not yet feature-complete, and bulk ingestion is among the features that still need to be implemented. I thought we already had an issue tracking the remaining work, but I can't find it now.

I'll open an issue today to help track the features that still need to be added.
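
Until bulk ingestion is implemented, calling code may want to detect this case rather than crash. The following is only a minimal sketch: `ingest_or_report` is a hypothetical helper (not part of ADBC), and it assumes the driver keeps reporting the missing feature with the option name `adbc.ingest.target_table` in the error message, as in the traceback above. It catches a generic `Exception` so that it does not depend on any particular exception class.

```python
# Hypothetical helper, not part of the ADBC API.
# Assumption: an unsupported ingest surfaces as an exception whose message
# mentions the rejected option name, as in the traceback in this issue.
INGEST_OPTION = "adbc.ingest.target_table"


def ingest_or_report(cursor, table_name, data):
    """Try cursor.adbc_ingest; return False if the driver rejects ingest options.

    Any other error is re-raised unchanged.
    """
    try:
        cursor.adbc_ingest(table_name=table_name, data=data)
    except Exception as exc:
        if INGEST_OPTION in str(exc):
            # The driver does not recognize the ingest option,
            # i.e. bulk ingestion is not implemented.
            return False
        raise
    return True
```

With the reproduction above, `ingest_or_report(cur, "foo", tbl)` would be expected to return False for as long as the driver rejects the option, at which point the caller can fall back to another loading path.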