TobikoData / sqlmesh

Efficient data transformation and modeling framework that is backwards compatible with dbt.
https://sqlmesh.com
Apache License 2.0
1.85k stars 164 forks source link

Key Error When Using Lateral View Explode #338

Closed eakmanrq closed 1 year ago

eakmanrq commented 1 year ago

Repro Model:

MODEL(
    name db.test,
);

SELECT
    tb.col,
    cola,
    struct_col_2,
    ds
FROM
    tb
    lateral view explode(struct_col) AS struct_col_2

Error:

...
File ~/repos/sqlmesh/sqlmesh/core/model/definition.py:632, in SqlModel.validate_definition(self)
    627     if count > 1:
    628         raise_config_error(
    629             f"Found duplicate outer select name `{name}`", self._path
    630         )
--> 632 super().validate_definition()

File ~/repos/sqlmesh/sqlmesh/core/model/definition.py:478, in _Model.validate_definition(self)
    472 if len(self.partitioned_by) != len(unique_partition_keys):
    473     raise_config_error(
    474         "All partition keys must be unique in the model definition",
    475         self._path,
    476     )
--> 478 column_names = {c.lower() for c in self.columns_to_types}
    479 missing_keys = unique_partition_keys - column_names
    480 if missing_keys:

File ~/repos/sqlmesh/sqlmesh/core/model/definition.py:586, in SqlModel.columns_to_types(self)
    583     return self.columns_to_types_
    585 if self._columns_to_types is None:
--> 586     query = annotate_types(self._query_renderer.render())
    587     self._columns_to_types = {
    588         expression.alias_or_name: expression.type
    589         for expression in query.expressions
    590     }
    592 return self._columns_to_types

File ~/repos/sqlglot/sqlglot/optimizer/annotate_types.py:31, in annotate_types(expression, schema, annotators, coerces_to)
      8 """
      9 Recursively infer & annotate types in an expression syntax tree against a schema.
     10 Assumes that we've already executed the optimizer's qualify_columns step.
   (...)
     26     sqlglot.Expression: expression annotated with types
     27 """
     29 schema = ensure_schema(schema)
---> 31 return TypeAnnotator(schema, annotators, coerces_to).annotate(expression)

File ~/repos/sqlglot/sqlglot/optimizer/annotate_types.py:276, in TypeAnnotator.annotate(self, expression)
    274         col.type = self.schema.get_column_type(source, col)
    275     elif source:
--> 276         col.type = selects[col.table][col.name].type
    277 # Then (possibly) annotate the remaining expressions in the scope
    278 self._maybe_annotate(scope.expression)

KeyError: 'cola'

The lateral view explode is not given a table alias and therefore cola must come from tb so it is not ambiguous. If I change the lateral view explode to this then the query works: lateral view explode(struct_col) table_alias AS struct_col_2

tobymao commented 1 year ago

https://github.com/tobymao/sqlglot/pull/1139

this should have fixed it