malloydata / malloy

Malloy is an experimental language for describing data relationships and transformations.
http://www.malloydata.dev
MIT License
1.95k stars 75 forks source link

Unchecked partition_by produces runtime error #1895

Open mtoy-googly-moogly opened 5 days ago

mtoy-googly-moogly commented 5 days ago
run: duckdb.sql("""
  SELECT 'n' as name, 1 as filed, 'val1' as sic
  UNION ALL SELECT 'n', 2, 'val1'
  UNION ALL SELECT 'n', 3, 'val2'
  UNION ALL SELECT 'n', 4, 'val3'
""") -> {
  group_by: sic
  calculate: lag_sic is lag(sic)
  group_by: filed
  calculate: change_in_sic is
    pick 'Initial' when lag(sic) = NULL
    pick 'True' when sic != lag(sic) {partition_by: name order_by: filed}
    else 'No Change'
  order_by: filed
}

Produces a runtime error "Internal Error, field Not defined name"

Change group_by: filed to group_by: filed. name and it works

If I leave filed out of the group_by: I get Unknown field filed in output space from the compiler

  1. We should check the partition_by: like we do the order_by: at compile time
  2. The error message could be more helpful, and it should at least be Unknown field 'filed' in output space
christopherswenson commented 5 days ago

I think what's happening is that it is being checked, but that check just isn't ensuring that it's an output field. It just checks whether it exists: https://github.com/malloydata/malloy/blob/main/packages/malloy/src/lang/ast/expressions/expr-func.ts#L305

So I think we just add a check whether the eval space is 'output' there.