Open stanzheng opened 2 years ago
This appears to already be supported. If I'm missing something, please let me know so I don't run into a problem soon!
For example, with sqlalchemy-bigquery 1.4.4:
import sqlalchemy
import sqlalchemy.ext.declarative
import sqlalchemy_bigquery
import os
Base = sqlalchemy.ext.declarative.declarative_base()
class Foo(Base):
__tablename__ = 'foo'
zap = sqlalchemy.Column(sqlalchemy.String, primary_key=True)
bar = sqlalchemy.Column(
sqlalchemy.ARRAY(
sqlalchemy_bigquery.STRUCT(
a=sqlalchemy.String,
b=sqlalchemy.String,
c=sqlalchemy.ARRAY(
sqlalchemy_bigquery.STRUCT(
x=sqlalchemy.String,
y=sqlalchemy.String)))))
engine = sqlalchemy.create_engine(f'bigquery://{os.environ["BQ_PROJECT_ID"]}/{os.environ["BQ_DATASET_ID"]}')
Base.metadata.create_all(engine)
This successfully creates a table with repeated structs:
And here is a more verbose example that includes inserts that I tested works: https://gist.github.com/jdimatteo/c223668095392e629ca01aadada4171c
Can support be added for representing all the sub-columns of an unnested array of structs? I am having to manually specify the columns that exist in an unnested array of structs instead of SQLalchemy "knowing" the columns that live in the unnested array of structs.
Also, this BigQuery documentation should probably be updated, since it mentions nothing about an unnested array of structs forming their own new columns in the result. Documentation: https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays#query_structs_in_an_array
Example (Incorrect result, participants results in 2 columns):
WITH Races AS (
SELECT "800M" AS race,
[STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)]
AS participants)
SELECT
race,
participant
FROM Races AS r
CROSS JOIN UNNEST(r.participants) AS participant;
+------+---------------------------------------+
| race | participant |
+------+---------------------------------------+
| 800M | {Rudisha, [23.4, 26.3, 26.4, 26.1]} |
| 800M | {Makhloufi, [24.5, 25.4, 26.6, 26.1]} |
| 800M | {Murphy, [23.9, 26, 27, 26]} |
| 800M | {Bosse, [23.6, 26.2, 26.5, 27.1]} |
| 800M | {Rotich, [24.7, 25.6, 26.9, 26.4]} |
| 800M | {Lewandowski, [25, 25.7, 26.3, 27.2]} |
| 800M | {Kipketer, [23.2, 26.1, 27.3, 29.4]} |
| 800M | {Berian, [23.7, 26.1, 27, 29.3]} |
+------+---------------------------------------+
Actual Result:
Thanks for stopping by to let us know something could be better!
PLEASE READ: If you have a support contract with Google, please create an issue in the support console instead of filing on GitHub. This will ensure a timely response.
Describe the solution you'd like
Ability to insert repeated nested STRUCT records. attribute to define repeated nested records
https://cloud.google.com/bigquery/docs/nested-repeated
Structs should support the bigquery REPEATED attribute on records.
A clear and concise description of what you want to happen. Describe alternatives you've considered
Serializing Records as Strings to deserialize in the client.
A clear and concise description of any alternative solutions or features you've considered. Additional context
tried using Arrays in conjunction with Structs
error stacktrace
Add any other context or screenshots about the feature request here.