Error:

```
/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/_pandas_helpers.py:267: UserWarning: Unable to determine type for field 'json'.
  warnings.warn("Unable to determine type for field '{}'.".format(bq_field.name))
Traceback (most recent call last):
  File "/home/jupyter/smapi/test_bq.py", line 12, in <module>
    df = bqexecutor.execute_query(sql)
  File "/home/jupyter/smapi/smapi/core/data_providers/sql/bq_executor.py", line 119, in execute_query
    querydf = rows.to_dataframe()
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 2151, in to_dataframe
    progress_bar_type=progress_bar_type,
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 1840, in to_arrow
    return pyarrow.Table.from_batches(record_batches)
  File "pyarrow/table.pxi", line 3747, in pyarrow.lib.Table.from_batches
ValueError: Must pass schema, or at least one RecordBatch
```
Environment details:

Python version: Python 3.9.0
Package versions:
google-cloud-bigquery 3.10.0
google-cloud-bigquery-storage 2.19.1
pyarrow 8.0.0
pandas 1.5.3
How to reproduce?

test_bq.py:

```python
from google.cloud import bigquery

client = bigquery.Client()

table_id = "gdw-dev-gdml.abehsu.reproduce_issue"

# Define the schema: one STRING field and one JSON field.
schema = [
    bigquery.SchemaField("field1", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("field2", "JSON", mode="REQUIRED"),
]

table = bigquery.Table(table_id, schema=schema)
table = client.create_table(table)  # Make an API request.
print(
    "Created table {}.{}.{}".format(table.project, table.dataset_id, table.table_id)
)

# Query the freshly created (and therefore empty) table; the zero-row
# result is what triggers the failure below.
QUERY = f"""
select * from {table_id}
"""
query_job = client.query(QUERY)  # API request
df = query_job.to_dataframe()
print(df)
```

Running it:

```
$ python test_bq.py
Created table gdw-dev-gdml.abehsu.reproduce_issue
```
Stack trace:

```
/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/_pandas_helpers.py:267: UserWarning: Unable to determine type for field 'field2'.
  warnings.warn("Unable to determine type for field '{}'.".format(bq_field.name))
Traceback (most recent call last):
  File "/home/jupyter/smapi/test_bq.py", line 41, in <module>
    df = query_job.to_dataframe()
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/job/query.py", line 1800, in to_dataframe
    return query_result.to_dataframe(
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 2150, in to_dataframe
    record_batch = self.to_arrow(
  File "/opt/conda/envs/smapi/lib/python3.9/site-packages/google/cloud/bigquery/table.py", line 1848, in to_arrow
    return pyarrow.Table.from_batches(record_batches, schema=arrow_schema)
  File "pyarrow/table.pxi", line 3747, in pyarrow.lib.Table.from_batches
ValueError: Must pass schema, or at least one RecordBatch
```
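For what it's worth, a workaround that appears to avoid the failure is to serialize the JSON column to STRING inside the query itself, so the result schema contains only types the library can already map. A minimal sketch reusing the repro's table and field names (`TO_JSON_STRING` is a standard BigQuery SQL function):

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "gdw-dev-gdml.abehsu.reproduce_issue"

# Serialize the JSON column to STRING in SQL so the BQ-to-Arrow schema
# conversion only sees types it has a mapping for.
QUERY = f"""
select field1, TO_JSON_STRING(field2) as field2
from {table_id}
"""
df = client.query(QUERY).to_dataframe()  # succeeds even with zero rows
print(df)
```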
The OP points to the mappings for BQ-to-Arrow data types, and JSON indeed does not appear there, nor in any of the other translation functions adjacent to that code.
When BigQuery returns an empty result set, the BigQuery Python SDK needs to convert the [BQ schema to an Arrow schema](https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/table.py#L1844-L1853), so it executes [`bq_to_arrow_data_type`](https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/_pandas_helpers.py#L225-L246). It looks like there is currently no [JSON type mapping](https://github.com/googleapis/python-bigquery/blob/main/google/cloud/bigquery/_pandas_helpers.py#L147-L162) there.
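The pyarrow half of the failure is easy to confirm in isolation (a minimal sketch; the one-field schema below is made up for illustration). `Table.from_batches` has nothing to build a table from when the batch list is empty and no schema is supplied, which appears to be exactly the state `to_arrow` ends up in once the JSON field defeats the schema conversion:

```python
import pyarrow

# No batches and no schema: pyarrow cannot infer anything and raises the
# same ValueError seen in both stack traces above.
try:
    pyarrow.Table.from_batches([])
except ValueError as exc:
    print(exc)  # Must pass schema, or at least one RecordBatch

# With an explicit schema, an empty batch list is fine: a zero-row table.
schema = pyarrow.schema([("field1", pyarrow.string())])
empty = pyarrow.Table.from_batches([], schema=schema)
print(empty.num_rows)  # 0
```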
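Until the library ships a mapping, a client-side stopgap might be to register one before calling `to_dataframe`. This is a sketch against google-cloud-bigquery 3.10.0: it assumes the module-level `BQ_TO_ARROW_SCALARS` dict that the linked lines define, and it reaches into a private module, so it is not a supported API.

```python
# Stopgap sketch, not a supported API: _pandas_helpers is a private module,
# and BQ_TO_ARROW_SCALARS is assumed from the linked 3.10.0 source. Values
# in that dict are pyarrow type constructors, so register the function
# pyarrow.string itself, not pyarrow.string().
import pyarrow

from google.cloud.bigquery import _pandas_helpers

_pandas_helpers.BQ_TO_ARROW_SCALARS["JSON"] = pyarrow.string
```

With an entry for JSON in place, `bq_to_arrow_schema` should be able to build a complete Arrow schema, so the empty-result path no longer calls `from_batches` without one.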
Thanks!