duckdb / dbt-duckdb

dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
Apache License 2.0

Support creating directories at write time #337

Open dchimeno opened 9 months ago

dchimeno commented 9 months ago

https://github.com/duckdb/duckdb/discussions/10665

I've opened this discussion in the duckdb repo. It probably belongs in duckdb, but I'm not sure whether it should/could be handled here first.

The general idea of what I would like, with a model config like this:

{{ config(
    materialized='external', 
    format='parquet',
    incremental_strategy = 'insert_overwrite',
    location="{{ env_var('DATALAKE_PATH') }}/a/b/c/whatever.parquet"
    ) }}

I would like this to work with either S3 URLs or local paths:

DATALAKE_PATH="data"
DATALAKE_PATH="s3://a-bucket"

It works with S3, but not with the local filesystem, where it raises an error like:

  IO Error: No files found that match the pattern "/data/a/b/c/whatever.parquet"

because a, b, or c doesn't exist.

jwills commented 9 months ago

Yeah, whenever duckdb supports this option, it will be easy to incorporate into dbt-duckdb's external materializations via the options dictionary argument. I can't think of a good way to hack this into dbt-duckdb itself, though; it's beyond the scope of jinja to do this sort of thing in a macro.
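
Until then, one possible workaround is to create the missing directories outside dbt, before the run. The snippet below is only a minimal sketch of that idea, not something dbt-duckdb provides: the helper name `ensure_parent_dir` is hypothetical, and it assumes that any location containing `://` (such as `s3://a-bucket/...`) is remote and needs no directories.

```python
from pathlib import Path


def ensure_parent_dir(location: str) -> bool:
    """Create the parent directories of a local file path.

    Hypothetical helper, not part of dbt-duckdb or DuckDB.
    Returns True if the location was treated as a local path,
    False if it looked like a remote URL and was skipped.
    """
    # Assumption: anything with a URL scheme (e.g. s3://) is remote.
    if "://" in location:
        return False
    # Create a/b/c for a/b/c/whatever.parquet; no error if it already exists.
    Path(location).parent.mkdir(parents=True, exist_ok=True)
    return True
```

A small wrapper script could read `DATALAKE_PATH` from the environment, call `ensure_parent_dir(f"{base}/a/b/c/whatever.parquet")` for each external location, and then invoke `dbt run`. The drawback is that the list of locations has to be kept in sync with the model configs by hand.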