duckdb / dbt-duckdb

dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
Apache License 2.0
790 stars 70 forks source link

Cannot create Map type column in SELECT statement #320

Closed sogunsemi closed 5 months ago

sogunsemi commented 5 months ago

I'm trying to replicate a transformation I have in my dbt-databricks setup. The data I'm reading is from a parquet file:

SELECT
  date,
  col1,
  col2,
  MAP(
      'col1', col1,
      'col2', col2
  ) AS mapped
FROM table1

Here is what I'm trying with dbt-duckdb:

SELECT
  date,
  col1,
  col2,
  map{
    'col1': col1,
    'col2': col2
  } AS mapped
FROM table1

This gives me an error that seems to be implying I can't create a map of (VARCHAR, BIGINT). I ran the query in the duckDB CLI and it worked just fine. Could there be an issue with how the dbt-duckdb library handles the map type? Here is the error:

15:25:27  Runtime error in model example_stg (models/staging/example_stg.sql) 
Binder Error: No function matches the given name and argument types '+(VARCHAR, BIGINT)'. You might need to add explicit type casts.
  Candidate functions:
  +(TINYINT) -> TINYINT
  +(TINYINT, TINYINT) -> TINYINT
  +(SMALLINT) -> SMALLINT
  +(SMALLINT, SMALLINT) -> SMALLINT
  +(INTEGER) -> INTEGER
  +(INTEGER, INTEGER) -> INTEGER
  +(BIGINT) -> BIGINT
  +(BIGINT, BIGINT) -> BIGINT
  +(HUGEINT) -> HUGEINT
  +(HUGEINT, HUGEINT) -> HUGEINT
  +(FLOAT) -> FLOAT
  +(FLOAT, FLOAT) -> FLOAT
  +(DOUBLE) -> DOUBLE
  +(DOUBLE, DOUBLE) -> DOUBLE
  +(DECIMAL) -> DECIMAL
  +(DECIMAL, DECIMAL) -> DECIMAL
  +(UTINYINT) -> UTINYINT
  +(UTINYINT, UTINYINT) -> UTINYINT
  +(USMALLINT) -> USMALLINT
  +(USMALLINT, USMALLINT) -> USMALLINT
  +(UINTEGER) -> UINTEGER
  +(UINTEGER, UINTEGER) -> UINTEGER
  +(UBIGINT) -> UBIGINT
  +(UBIGINT, UBIGINT) -> UBIGINT
  +(DATE, INTEGER) -> DATE
  +(INTEGER, DATE) -> DATE
  +(INTERVAL, INTERVAL) -> INTERVAL
  +(DATE, INTERVAL) -> DATE
  +(INTERVAL, DATE) -> DATE
  +(TIME, INTERVAL) -> TIME
  +(INTERVAL, TIME) -> TIME
  +(TIMESTAMP, INTERVAL) -> TIMESTAMP
  +(INTERVAL, TIMESTAMP) -> TIMESTAMP
  +(TIME, DATE) -> TIMESTAMP
  +(DATE, TIME) -> TIMESTAMP
  +(ANY[], ANY[]) -> ANY[]
  +(TIMESTAMP WITH TIME ZONE, INTERVAL) -> TIMESTAMP WITH TIME ZONE
  +(INTERVAL, TIMESTAMP WITH TIME ZONE) -> TIMESTAMP WITH TIME ZONE
jwills commented 5 months ago

The error makes it seem like the problem is happening because you're trying to add a varchar and a bigint, but I don't see where that's happening in the query you posted (i.e., there's no + operator.) So I'm a bit puzzled-- there's nowhere in the example_stg model where you're adding something?

sogunsemi commented 5 months ago

The error makes it seem like the problem is happening because you're trying to add a varchar and a bigint, but I don't see where that's happening in the query you posted (i.e., there's no + operator.) So I'm a bit puzzled-- there's nowhere in the example_stg model where you're adding something?

So in that example table1 is a model that contains the columns col1 and col2 which are both of type bigint. I'm attempting to combine those 2 columns into a map. The goal is to create a new column called mapped of type map(varchar, bigint).

jwills commented 5 months ago

Yeah I understand the goal-- I'm confused b/c the error is about the addition operator: '+(VARCHAR, BIGINT)' and yet I don't see any + signs in the query you posted that was throwing that error; did you post the whole query, or simplify it somewhat to remove some stuff that would have possibly included a + sign?

sogunsemi commented 5 months ago

Yeah I understand the goal-- I'm confused b/c the error is about the addition operator: '+(VARCHAR, BIGINT)' and yet I don't see any + signs in the query you posted that was throwing that error; did you post the whole query, or simplify it somewhat to remove some stuff that would have possibly included a + sign?

Ah right sorry, I hadn't posted the actual query, just an example of it. I thought the error had to do with the map function but turns out it does not.

The real issue was that in another model that was referenced in example_stg, a column that should have been a bigint was created as a string. This string column was then added to another bigint column (col1 + col2) which caused this error. I didn't understand that's what the error was talking about.

This was a user error, I'll close this, thanks!