Closed aaronsteers closed 1 year ago
Yeah, my understanding (which I would be grateful if you would confirm for me) is that you can treat everything in one of dbt's .yml
files as if it was jinja-fied as well, including meta tags on sources. So in your example, something like:
meta:
  external_location: "{{ \"read_csv_auto('./jaffle-data/{name}.csv', header=1)\" if some_value is not none }}"
...should return an empty string as the value of the external_location property if some_value is None. If the value of external_location is an empty string, then this condition will evaluate to false, and we would fall back to using the standard dbt strategy for rendering the source table as a relation in the model.
(Note: the fact that any field in a sources.yml file may be treated as a jinja template is exactly why we use f-string style formatting for the name and identifier in external sources.)
Great, thanks @jwills !
I'll close for now and reopen if I run into any issue.
Also, will post back here when I see confirmation either way. 👍
@jwills - Circling back.
These both work:
external_location: null # Works!
external_location: '' # Works!
However, when calculating the value with jinja, I couldn't get null working.
I ultimately went with returning an empty string.
sources:
  - name: ecom
    schema: "{{ env_var('JAFFLE_RAW_SCHEMA', 'jaffle_raw') }}"
    description: E-commerce data
    meta:
      # If `$JAFFLE_RAW_SCHEMA` is specified, use the provided raw data. Otherwise, use the csv seed data from the repo.
      external_location: >-
        {{ '' if env_var('JAFFLE_RAW_SCHEMA', '') else 'read_csv_auto("./jaffle-data/{name}.csv", header=1)' }}
This seems to work perfectly. 🙌 Thanks again!
(Nitty-gritty details: I used the >- yaml operator, which (1) lets me put the string on the next line for readability, (2) reduces the string escaping I need to do, and (3) doesn't add any leading or trailing newlines.)
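For readers unfamiliar with the >- operator mentioned above, here is a small sketch of how it compares with the alternatives. The key names (folded_strip, folded_clip, quoted) and the short file path are made up purely for illustration and are not dbt properties.

```yaml
meta:
  # ">-" = folded block scalar with "strip" chomping:
  # the value is the single line below, with no trailing newline.
  folded_strip: >-
    {{ 'read_csv_auto("./f.csv")' }}

  # ">" = folded block scalar with default "clip" chomping:
  # same text, but one trailing newline is kept at the end of the value.
  folded_clip: >
    {{ 'read_csv_auto("./f.csv")' }}

  # Double-quoted scalar: same value as folded_strip, but the inner
  # double quotes now have to be escaped by hand.
  quoted: "{{ 'read_csv_auto(\"./f.csv\")' }}"
```

The strip variant is the one that matters here, since a stray trailing newline would become part of the rendered external_location string.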
@aaronsteers that is good to hear, I'll add a section about this situation (which I expect to be more common in the future) when I update the docs this week.
Wanted to clarify one thing tho: does this not work?
sources:
  - name: ecom
    schema: "{{ env_var('JAFFLE_RAW_SCHEMA', 'jaffle_raw') }}"
    description: E-commerce data
    meta:
      # If `$JAFFLE_RAW_SCHEMA` is specified, use the provided raw data. Otherwise, use the csv seed data from the repo.
      external_location: >-
        {{ 'read_csv_auto("./jaffle-data/{name}.csv", header=1)' if not env_var('JAFFLE_RAW_SCHEMA', '') }}
(totally understand the way you structured the check to maximize clarity, I just want to be sure my understanding re the {{ value if cond }} construct is correct)
Unfortunately not: TypeError: can not serialize 'Undefined' object
Any attempt to return None or null seems to give me this same error.
This appears to be an issue with how dbt handles nulls when returned from jinja.
You can repro with:
sources:
  - name: ecom
    schema: "{{ env_var('JAFFLE_RAW_SCHEMA', 'jaffle_raw') }}"
    meta:
      external_location: "{{ 'not-used' if false }}"
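For contrast with the repro, a minimal sketch of the same source with the expression forced to always produce a string. The first form mirrors the working config earlier in the thread (an explicit else branch). The commented-out second form relies on Jinja's built-in default filter, which replaces an undefined value with the given fallback in plain Jinja; it has not been verified here against dbt's renderer, so treat it as an assumption.

```yaml
sources:
  - name: ecom
    schema: "{{ env_var('JAFFLE_RAW_SCHEMA', 'jaffle_raw') }}"
    meta:
      # Explicit else branch: the expression yields '' instead of an
      # undefined value when the condition is false.
      external_location: "{{ 'not-used' if false else '' }}"

      # Untested alternative (assumption): let the default filter turn the
      # undefined result of the else-less inline-if into ''.
      # external_location: "{{ ('not-used' if false) | default('') }}"
```

Either way the goal is the same: external_location ends up as an empty string, which the thread reports as working, rather than an undefined value that fails to serialize.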
Ah, fascinating-- thank you sir!
I'm working on a PR where we may want to have the option to either import raw data or load from seed files via the external_location annotation.
Is it possible to return a value in external_location that nullifies the expression? Something like the pseudocode:
Where in this case null would be interpreted as no instruction.
The alternative is to ask users to comment and uncomment the value, e.g.:
https://github.com/MeltanoLabs/jaffle-shop-template/pull/1/files#diff-6c181d0ccc3c5e37e2feaacdd8872e09073b138835156160dcb5ee7c159f8c63L3-R9
Can you say if null today would cause an error, or just a no-op relative to the external_location operator?