TobikoData / sqlmesh

Efficient data transformation and modeling framework that is backwards compatible with dbt.
https://sqlmesh.com
Apache License 2.0
1.82k stars 160 forks source link

Fix: Bigquery support of complex nested types #3190

Closed Themiscodes closed 1 month ago

Themiscodes commented 1 month ago

Fix issue of constructing schema fields for nested and array types, by using the function __get_bq_schemafield from https://github.com/TobikoData/sqlmesh/pull/3185 when constructing a record or array schema field and the function _build_nested_fields to recursively build nested fields, when appending to a record fixes: #3184

plaflamme commented 1 month ago

I don't think this will work if the field being modified is ARRAY<STRUCT<>>, I believe this branch needs to deal with this case.

Also, I was hoping some unit tests would be added as part of this work, the cases we have are:

Themiscodes commented 1 month ago

You're right @plaflamme ARRAY<STRUCT<...>> could be fixed by passing the field.mode instead of nullable in the branch you mentioned, but going over two levels of nesting (e.g. STRUCT<ARRAY<STRUCT<ARRAY<...>>>>) would still not be able to get resolved correctly. I will look into how to handle these cases as well and come up with a fix and tests for them. Thanks for your example use cases they're very helpful!

plaflamme commented 1 month ago

@Themiscodes Yeah, this is why I refrained from trying to make the fix, it's a bit tricky and I wasn't sure I understood the current code fully.

But from what I understand, if the table looks like this:

id INT64,
complex STRUCT<foo ARRAY<STRUCT<bar STRING>>>

and we try to alter the nested foo to something like this:

id INT64,
complex STRUCT<foo ARRAY<STRUCT<bar STRING, baz INT64>>>

Then, as long as we run the new method from the top-level complex field, it should just work. The part that wasn't clear to me is what's in the nested_fields_to_add argument. If that contains foo, then yeah, this is much harder, but if it contains complex, then it should just work.