duckdb / dbt-duckdb

dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
Apache License 2.0
853 stars 77 forks source link

Dbt models that are materialized as views result in catalog error (table cannot be found) #67

Open akshaysu12 opened 1 year ago

akshaysu12 commented 1 year ago

Dbt models that are materialized as views are unable to be found:

Catalog Error: Table with name <table_name> does not exist

The sql view model seems to get created without issues:

Concurrency: 1 threads (target='duckdb') 1 of 4 OK created sql view model

Running a select statement on the model returns the missing table error. Wondering if this is a bug/missing feature or if I am missing something in the setup?

I see that there is a duckdb specific create_view_as macro. Does a corresponding view.sql need to be created under the materializations folder?

https://github.com/dbt-labs/dbt-core/blob/dev/louisa-may-alcott/core/dbt/include/global_project/macros/materializations/view/view.sql

jwills commented 1 year ago

hrm, that surprises me, we test that logic pretty thoroughly I just verified it works locally; can you post your duckdb config from profiles.yml and the version of dbt-duckdb you're using?

akshaysu12 commented 1 year ago

After a little more digging you are correct views do work. When I wrap my view in a CTE however it fails. From this thread it looks like duckdb does not support CTEs in views. https://github.com/duckdb/duckdb/issues/2479.

jwills commented 1 year ago

I don't understand- how do you wrap a view in a CTE? Isn't it dbt's job to turn the SELECT query into a view for you? I do stuff like:

WITH stg as (
  SELECT * FROM {{ source('stg', 'table') }}
)
SELECT * FROM stg

all the time and it works fine.

akshaysu12 commented 1 year ago

Yes I'm doing the same:

WITH cte_expr AS (
    SELECT 
        tablename.first_select,
        tablename.second_select
    FROM schemaname.tablename
)

SELECT * FROM cte_expr

When I add the tag: +materialized: table then I have no issues. When I change this to +materialized: view or remove any materialization on the model then I run into the catalog error.

When I remove the CTE and use this:

SELECT 
      tablename.first_select,
      tablename.second_select
FROM schemaname.tablename

I have no issues (materialized as a view or a table).

If you cannot reproduce this locally I will keep trying to see what I'm doing wrong on my end!

here's my config:

profiles.yml

    duckdb:
      type: duckdb
      path: 'gpa.duckdb'
      schema: "dev_{{ env_var('USER') }}"

version: dbt-duckdb = "^1.3.2"

jwills commented 1 year ago

Yeah I can't get it to fail, here's what I'm running with:

An ingest.sql model:

{{ config(materialized='view') }}

WITH input AS (
  select *
  from {{ source('csvs', 'test') }}
)
SELECT * FROM input

...and a profile that matches yours, and I see my data in dev_jwills.ingest defined as a view.

To clarify: the dbt run call succeeds, but when you go to open gpa.duckdb using the DuckDB CLI (or an equivalent python script), the view is simply not present in dev_{{ env_var('USER') }}?

What happens if you add a dbt test on the model in your schemas.yml?

akshaysu12 commented 1 year ago

Sorry I should have been more clear. Yes the dbt run call succeeds. I think the view successfully gets created as well (I can see the view in Dbeaver). When I try to query the view though I run into the catalog error. I'm just running a simple select * from dbtmodelname. Is that an incorrect way of querying a view in duckdb?

I've tried from dbeaver, python client and a dbt test all with the same results.

jwills commented 1 year ago

oh interesting-- I always query SELECT * FROM <schema>.<relation>; maybe that's the difference? And you're saying for some reason SELECT * FROM <relation> works correctly unless the view in question was defined with a CTE?

jwills commented 1 year ago

(For relations that aren't defined in the main schema, which is the default for DuckDB)

akshaysu12 commented 1 year ago

sorry again I am not being specific enough. I am specifying the schema name:

materialized as view

>>> con.execute("SELECT * FROM dev_asubramanian.tablename")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
duckdb.CatalogException: Catalog Error: Table with name tablename does not exist!
Did you mean "dev_asubramanian_1669766215.tablename"?

materialized as table

>>> con.execute("SELECT * FROM dev_asubramanian.tablename")
<duckdb.DuckDBPyConnection object at 0x1243cfcb0>
jwills commented 1 year ago

hrm, and I'm assuming "dev_asubramanian_1669766215" isn't a schema you created?

akshaysu12 commented 1 year ago

It is a schema I created (unique schema for testing). But it has the same issue.

### >>> con.execute("SELECT * FROM dev_asubramanian_1669766215.tablename")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
duckdb.CatalogException: Catalog Error: Table with name tablename does not exist!
Did you mean "dev_asubramanian_1669766215.tablename"?
jwills commented 1 year ago

well that is really weird. I almost wonder if there is like a character encoding issue somewhere-- for whatever reason, DuckDB doesn't seem to believe that the name of that view in the DB is the same as the name of the view you are querying for. 🤔

akshaysu12 commented 1 year ago

I am seeing issues when I try to open up the db with the duckdb cli:

./duckdb -init $HOME/analytics/dbt/gpa.duckdb

Error: near line 1: Parser Error: syntax error at or near ""
LINE 1: ??c????DUCK'?+????????
        ^
Error: near line 92: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 104: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 112: Parser Error: syntax error at or near "storder"
LINE 1: storder 16697614321432
        ^
Error: near line 123: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 134: Parser Error: syntax error at or near "der"
LINE 1: der669688817
        ^
Error: near line 145: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...
        ^
Error: near line 493: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 505: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 513: Parser Error: syntax error at or near "storder"
LINE 1: storder 16697614321432
        ^
Error: near line 524: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 535: Parser Error: syntax error at or near "der"
LINE 1: der669688817
        ^
Error: near line 546: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...
jwills commented 1 year ago

I think it’s having issues replaying the WAL which again might be an encoding-related problem

Mause commented 1 year ago

I am seeing issues when I try to open up the db with the duckdb cli:

./duckdb -init $HOME/analytics/dbt/gpa.duckdb

Error: near line 1: Parser Error: syntax error at or near ""
LINE 1: ??c????DUCK'?+????????
        ^
Error: near line 92: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 104: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 112: Parser Error: syntax error at or near "storder"
LINE 1: storder   16697614321432
        ^
Error: near line 123: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 134: Parser Error: syntax error at or near "der"
LINE 1: der669688817
        ^
Error: near line 145: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...
        ^
Error: near line 493: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 505: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 513: Parser Error: syntax error at or near "storder"
LINE 1: storder   16697614321432
        ^
Error: near line 524: Parser Error: syntax error at or near "der"
LINE 1: der
        ^
Error: near line 535: Parser Error: syntax error at or near "der"
LINE 1: der669688817
        ^
Error: near line 546: Parser Error: syntax error at or near "der"
LINE 1: der?^????^?? ?^?????^???^??^??...

Is there a reason you're passing a duckdb database to a parameter that expects an SQL file?

akshaysu12 commented 1 year ago

just ignorance. How do I initialize an in-memory db from an existing duckdb file?

Lyonsclay commented 1 year ago

@akshaysu12 In your profiles.yml you can specify existing database in the target field with the relative path.

example:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: target/example.duckdb
Lyonsclay commented 1 year ago

Actually maybe you are asking how to do that on the cli. In that case you wouldn't need the --init flag. ./duckdb $HOME/analytics/dbt/gpa.duckdb

jwills commented 1 year ago

He's asking about the cli tho, which is just duckdb /path/to/the/existing/file.db

Thank you @Mause for spotting the -init arg that I missed!

akshaysu12 commented 1 year ago

oh that is much simpler, thank you!