duckdb / dbt-duckdb

dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
Apache License 2.0
852 stars 77 forks source link

External Location: f-string value needs more than one variable #127

Closed armsepehr closed 1 year ago

armsepehr commented 1 year ago

I am interested in using external_location f-string with several variable but only name, indicator and schema are known in the f-string.

sources:
  - name: external_source
    meta:
      external_location: "s3://my-bucket/{folder2}/{meta.name}.{meta.suffix}"
    tables:
      - name: source1
         meta:
             suffix: parquet
             folder: folder1
      - name: source2
        meta:
              suffix: csv
              folder: folder2

Additionally, I don't want to overwrite the external_location for each table name. In this regard, I suggest to improve the f-string in real format name such as below:

- name: external_source
    meta:
      external_location: "s3://my-bucket/{source.meta['folder2']}/{source.meta['name']}.{source.meta['suffix'}"
    tables:
      - name: source1
         meta:
             suffix: parquet
             folder: folder1
      - name: source2
        meta:
              suffix: csv
              folder: folder2

In this case, the indicator should be addressed the source.indicator. I can do it myself, but please clarify that do you have any suggestion to improve this? I have identified the problem in duckdb/relation.py and send the PR to you, but as far as I'm not sure that you're agree with change I didn't modify the README. Additionally, the test should be improved in this regard, so I can do it myself after your suggestion.

jwills commented 1 year ago

Ah, nice-- this is a good idea.

Why not just embed the contents of the meta dictionary in the formatting context so you could do e.g. s3://my-bucket/{folder2}/{name}.{suffix} w/o the extra boilerplate? Are you worried about a naming conflict?

armsepehr commented 1 year ago

I think it's better to distinguish between the variable inside and outside the meta variable; however, this is a good idea to simplify the variable inside the f-string compare to directly referencing such assource.meta["folder2]. What you said has another advantage and the previous test already passed to reference name, schema and identifier. Please clarify your suggestion according to address a variable inside the f-string.

jwills commented 1 year ago

I was thinking like this (the test illustrates how it would look): https://github.com/jwills/dbt-duckdb/pull/130

armsepehr commented 1 year ago

Understood. It works as far as we don't have naming conflict. In the other side, I think what you said is more convenient and easy to reference the variable. Thanks.