malloydata / malloy

Malloy is an experimental language for describing data relationships and transformations.
http://www.malloydata.dev
MIT License
1.99k stars 76 forks source link

Malloy generating invalid SQL (empty grouping set) #1407

Open mtoy-googly-moogly opened 1 year ago

mtoy-googly-moogly commented 1 year ago

Possibly because I asked to do something I shouldn't have because I do not understand things

model and query ...

source: boosterContents is duckdb.table('AllPrintings/setBoosterContents.parquet')
extend {
    join_one: contentWeight is _boosterContentsWeights
        on contentWeight.boosterIndex = boosterIndex
        and contentWeight.boosterName = boosterName
        and contentWeight.setCode = setCode

    view: booster is {
        group_by:
            setCode,
            boosterIndex,
            name is boosterName,
            weight is contentWeight.boosterWeight
        nest: pack is {
            select: sheetName, sheetPicks
        } 
    }
}
source: sets is duckdb.table('AllPrintings/sets.parquet') extend {
    primary_key: code
    join_one: boosterList is boosterContents -> booster on code = boosterList.setCode
}

run: sets -> {
    where: code = 'LTR'
    group_by: code
    nest: boosters is {
        where: boosterList.name = 'default'
        select: boosterList.*
    }
}

Error "Run Query" generates ...

Parser Error: syntax error at or near ")"
LINE 42:   AND ((group_set NOT IN () OR (group_set IN () AND boosterList_0."name"='default')))
  GROUP BY 1,2
)
SELECT
  "code__0" as "code",
  MAX(CASE WHEN group_set=0 THEN "boosters__0" END) as "boosters"
FROM __stage2
GROUP BY 1
ORDER BY 1 asc NULLS LAST
...
                                   ^

SQL for the query ...

Parser Error: syntax error at or near ")"
LINE 42:   AND ((group_set NOT IN () OR (group_set IN () AND boosterList_0."name"='default')))
  GROUP BY 1,2
)
SELECT
  "code__0" as "code",
  MAX(CASE WHEN group_set=0 THEN "boosters__0" END) as "boosters"
FROM __stage2
GROUP BY 1
ORDER BY 1 asc NULLS LAST
...
WITH __stage0 AS (
  SELECT
    group_set,
    boosterContents."setCode" as "setCode__0",
    boosterContents."boosterIndex" as "boosterIndex__0",
    boosterContents."boosterName" as "name__0",
    contentWeight_0."boosterWeight" as "weight__0",
    COALESCE(LIST({
      "sheetName": boosterContents."sheetName", 
      "sheetPicks": boosterContents."sheetPicks"}  ORDER BY  boosterContents."sheetName" asc NULLS LAST) FILTER (WHERE group_set=0),[]) as "pack__0"
  FROM 'AllPrintings/setBoosterContents.parquet' as boosterContents
  LEFT JOIN 'AllPrintings/setBoosterContentWeights.parquet' AS contentWeight_0
    ON ((contentWeight_0."boosterIndex"=boosterContents."boosterIndex")and(contentWeight_0."boosterName"=boosterContents."boosterName"))and(contentWeight_0."setCode"=boosterContents."setCode")
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,0,1)) as group_set  ) as group_set
  GROUP BY 1,2,3,4,5
)
, __stage1 AS (
  SELECT
    "setCode__0" as "setCode",
    "boosterIndex__0" as "boosterIndex",
    "name__0" as "name",
    "weight__0" as "weight",
    MAX(CASE WHEN group_set=0 THEN "pack__0" END) as "pack"
  FROM __stage0
  GROUP BY 1,2,3,4
  ORDER BY 1 asc NULLS LAST
)
, __stage2 AS (
  SELECT
    group_set,
    sets."code" as "code__0",
    COALESCE(LIST({
      "setCode": boosterList_0."setCode", 
      "boosterIndex": boosterList_0."boosterIndex", 
      "name": boosterList_0."name", 
      "weight": boosterList_0."weight"}  ORDER BY  boosterList_0."setCode" asc NULLS LAST) FILTER (WHERE group_set=0),[]) as "boosters__0"
  FROM 'AllPrintings/sets.parquet' as sets
  LEFT JOIN __stage1 AS boosterList_0
    ON sets."code"=boosterList_0."setCode"
  CROSS JOIN (SELECT UNNEST(GENERATE_SERIES(0,0,1)) as group_set  ) as group_set
  WHERE (sets."code"='LTR')
  AND ((group_set NOT IN () OR (group_set IN () AND boosterList_0."name"='default')))
  GROUP BY 1,2
)
SELECT
  "code__0" as "code",
  MAX(CASE WHEN group_set=0 THEN "boosters__0" END) as "boosters"
FROM __stage2
GROUP BY 1
ORDER BY 1 asc NULLS LAST
gmaden commented 1 year ago

I think this is the same as https://github.com/malloydata/malloy/issues/663 for more context as well.