dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0

[CT-1215] [Bug] support indirect strings in `ref` and `source` for Python models #5887

Closed lostmygithubaccount closed 1 year ago

lostmygithubaccount commented 2 years ago

Is this a new bug in dbt-core?

Current Behavior

EDIT: see the comment below. The failure is not caused by iterables, lists, or dictionaries. dbt statically analyzes the model code, so it cannot resolve a string that reaches `ref` indirectly through a variable. An even simpler reproduction:

def model(dbt, session):

    model_name = "orders"

    return dbt.ref(model_name)
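
The traceback further down ends in `ast.literal_eval`, which only accepts literal nodes. The following is a minimal standalone sketch of that behavior using only the standard library. It is not dbt's parser code; `first_ref_argument` and `try_literal_eval` are hypothetical helper names introduced for illustration.

```python
import ast

# The simple reproduction from this issue: the name is passed via a variable.
MODEL_SOURCE = '''
def model(dbt, session):
    model_name = "orders"
    return dbt.ref(model_name)
'''

# The same model with the name written as a string literal.
LITERAL_SOURCE = '''
def model(dbt, session):
    return dbt.ref("orders")
'''


def first_ref_argument(source: str) -> ast.expr:
    """Return the AST node of the first positional argument of dbt.ref(...)."""
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "ref"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "dbt"
        ):
            return node.args[0]
    raise LookupError("no dbt.ref call found")


def try_literal_eval(source: str) -> str:
    """Describe what ast.literal_eval does with the dbt.ref argument."""
    arg = first_ref_argument(source)
    try:
        return f"ok: {ast.literal_eval(arg)!r}"
    except ValueError:
        # A variable reference is an ast.Name node, which is not a literal.
        return f"{type(arg).__name__} rejected: ValueError"


print(try_literal_eval(LITERAL_SOURCE))
print(try_literal_eval(MODEL_SOURCE))
```

The literal version evaluates to the string, while the variable version is an `ast.Name` node that `ast.literal_eval` refuses, because no code is executed during static analysis and the variable therefore has no value yet.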

Take debug1.py and debug2.py, built off of jaffle_shop, as simple examples:

def model(dbt, session):

    model_names = ["orders", "customers"]
    models = []

    for model_name in model_names:
        models.append(dbt.ref(model_name))

    return models[0]

def model(dbt, session):

    models = {
        "orders": None,
        "customers": None
    } 

    for model_name in models:
        models[model_name] = dbt.ref(model_name)

    return models["orders"]

I would expect both of these examples to work, but they fail with the parsing error shown below.

Expected Behavior

Both examples above should parse and run.

Steps To Reproduce

gh repo clone dbt-labs/jaffle_shop
cd jaffle_shop
dbt seed
dbt run
code debug1.py debug2.py

Copy the code above into those files, then run:

dbt run -s debug1
dbt run -s debug2

Relevant log output

@lostmygithubaccount ➜ .../learn-dbt-py/demos/jaffle_shops/snowpark (main ✗) $ dbt run -s debug1
16:32:45  Running with dbt=1.3.0-b2
16:32:45  Unable to do partial parsing because profile has changed
16:32:46  Encountered an error:
Parsing Error in model debug1 (models/debug1.py)
  malformed node or string on line 8: <ast.Name object at 0x7f262340ba60>
16:32:46  Traceback (most recent call last):
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 88, in _safe_eval
    return ast.literal_eval(node)
  File "/usr/local/lib/python3.10/ast.py", line 108, in literal_eval
    return _convert(node_or_string)
  File "/usr/local/lib/python3.10/ast.py", line 107, in _convert
    return _convert_signed_num(node)
  File "/usr/local/lib/python3.10/ast.py", line 81, in _convert_signed_num
    return _convert_num(node)
  File "/usr/local/lib/python3.10/ast.py", line 72, in _convert_num
    _raise_malformed_node(node)
  File "/usr/local/lib/python3.10/ast.py", line 69, in _raise_malformed_node
    raise ValueError(msg + f': {node!r}')
ValueError: malformed node or string on line 8: <ast.Name object at 0x7f262340ba60>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/codyspace/dbt-core/core/dbt/main.py", line 129, in main
    results, succeeded = handle_and_check(args)
  File "/workspaces/codyspace/dbt-core/core/dbt/main.py", line 191, in handle_and_check
    task, res = run_from_args(parsed)
  File "/workspaces/codyspace/dbt-core/core/dbt/main.py", line 238, in run_from_args
    results = task.run()
  File "/workspaces/codyspace/dbt-core/core/dbt/task/runnable.py", line 453, in run
    self._runtime_initialize()
  File "/workspaces/codyspace/dbt-core/core/dbt/task/runnable.py", line 161, in _runtime_initialize
    super()._runtime_initialize()
  File "/workspaces/codyspace/dbt-core/core/dbt/task/runnable.py", line 94, in _runtime_initialize
    self.load_manifest()
  File "/workspaces/codyspace/dbt-core/core/dbt/task/runnable.py", line 81, in load_manifest
    self.manifest = ManifestLoader.get_full_manifest(self.config)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/manifest.py", line 220, in get_full_manifest
    manifest = loader.load()
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/manifest.py", line 350, in load
    self.parse_project(
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/manifest.py", line 475, in parse_project
    parser.parse_file(block)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/base.py", line 414, in parse_file
    self.parse_node(file_block)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/base.py", line 388, in parse_node
    self.render_update(node, config)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 226, in render_update
    self.parse_python_model(node, config, context)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 207, in parse_python_model
    dbtParser.visit(tree)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/usr/local/lib/python3.10/ast.py", line 418, in generic_visit
    self.visit(item)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/usr/local/lib/python3.10/ast.py", line 418, in generic_visit
    self.visit(item)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/usr/local/lib/python3.10/ast.py", line 418, in generic_visit
    self.visit(item)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/usr/local/lib/python3.10/ast.py", line 420, in generic_visit
    self.visit(value)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 129, in visit_Call
    self.visit_Call(obj)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 122, in visit_Call
    args, kwargs = self._get_call_literals(node)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 104, in _get_call_literals
    rendered = self._safe_eval(arg)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 91, in _safe_eval
    raise ParsingException(msg, node=self.dbt_node) from exc
dbt.exceptions.ParsingException: Parsing Error in model debug1 (models/debug1.py)
  malformed node or string on line 8: <ast.Name object at 0x7f262340ba60>
@lostmygithubaccount ➜ .../learn-dbt-py/demos/jaffle_shops/snowpark (main ✗) $ dbt run -s debug2
16:35:55  Running with dbt=1.3.0-b2
16:35:55  Unable to do partial parsing because profile has changed
16:35:56  Encountered an error:
Parsing Error in model debug2 (models/debug2.py)
  malformed node or string on line 9: <ast.Name object at 0x7f2f0ce29780>
16:35:56  Traceback (most recent call last):
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 88, in _safe_eval
    return ast.literal_eval(node)
  File "/usr/local/lib/python3.10/ast.py", line 108, in literal_eval
    return _convert(node_or_string)
  File "/usr/local/lib/python3.10/ast.py", line 107, in _convert
    return _convert_signed_num(node)
  File "/usr/local/lib/python3.10/ast.py", line 81, in _convert_signed_num
    return _convert_num(node)
  File "/usr/local/lib/python3.10/ast.py", line 72, in _convert_num
    _raise_malformed_node(node)
  File "/usr/local/lib/python3.10/ast.py", line 69, in _raise_malformed_node
    raise ValueError(msg + f': {node!r}')
ValueError: malformed node or string on line 9: <ast.Name object at 0x7f2f0ce29780>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspaces/codyspace/dbt-core/core/dbt/main.py", line 129, in main
    results, succeeded = handle_and_check(args)
  File "/workspaces/codyspace/dbt-core/core/dbt/main.py", line 191, in handle_and_check
    task, res = run_from_args(parsed)
  File "/workspaces/codyspace/dbt-core/core/dbt/main.py", line 238, in run_from_args
    results = task.run()
  File "/workspaces/codyspace/dbt-core/core/dbt/task/runnable.py", line 453, in run
    self._runtime_initialize()
  File "/workspaces/codyspace/dbt-core/core/dbt/task/runnable.py", line 161, in _runtime_initialize
    super()._runtime_initialize()
  File "/workspaces/codyspace/dbt-core/core/dbt/task/runnable.py", line 94, in _runtime_initialize
    self.load_manifest()
  File "/workspaces/codyspace/dbt-core/core/dbt/task/runnable.py", line 81, in load_manifest
    self.manifest = ManifestLoader.get_full_manifest(self.config)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/manifest.py", line 220, in get_full_manifest
    manifest = loader.load()
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/manifest.py", line 350, in load
    self.parse_project(
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/manifest.py", line 475, in parse_project
    parser.parse_file(block)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/base.py", line 414, in parse_file
    self.parse_node(file_block)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/base.py", line 388, in parse_node
    self.render_update(node, config)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 226, in render_update
    self.parse_python_model(node, config, context)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 207, in parse_python_model
    dbtParser.visit(tree)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/usr/local/lib/python3.10/ast.py", line 418, in generic_visit
    self.visit(item)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/usr/local/lib/python3.10/ast.py", line 418, in generic_visit
    self.visit(item)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/usr/local/lib/python3.10/ast.py", line 418, in generic_visit
    self.visit(item)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/usr/local/lib/python3.10/ast.py", line 420, in generic_visit
    self.visit(value)
  File "/usr/local/lib/python3.10/ast.py", line 410, in visit
    return visitor(node)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 122, in visit_Call
    args, kwargs = self._get_call_literals(node)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 104, in _get_call_literals
    rendered = self._safe_eval(arg)
  File "/workspaces/codyspace/dbt-core/core/dbt/parser/models.py", line 91, in _safe_eval
    raise ParsingException(msg, node=self.dbt_node) from exc
dbt.exceptions.ParsingException: Parsing Error in model debug2 (models/debug2.py)
  malformed node or string on line 9: <ast.Name object at 0x7f2f0ce29780>

Environment

- OS: linux
- Python: 3.10
- dbt: off of main branch (1.3)

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

lostmygithubaccount commented 2 years ago

this is a bit complicated -- even this would fail:

def model(dbt, session):

    model_name = "orders"

    return dbt.ref(model_name)

An intermediate step could be a clearer error message.

ChenyuLInx commented 2 years ago

We improved the error message to tell users not to do this, but we want to leave this issue open so we can revisit it and see whether there are better options to provide in the future.
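
The improved message itself is not quoted in this thread. As a rough illustration of the kind of check that makes a clearer message possible, here is a standalone sketch that scans a model's source for `dbt.ref` / `dbt.source` calls whose arguments are not string literals. It is not dbt's implementation; `non_literal_dbt_calls`, `DBT_FUNCS`, and the message wording are assumptions made for this example, and `ast.unparse` requires Python 3.9+.

```python
import ast

# Function names taken from the issue title; everything else here is hypothetical.
DBT_FUNCS = {"ref", "source"}

DEBUG_SOURCE = '''def model(dbt, session):

    model_name = "orders"

    return dbt.ref(model_name)
'''


def non_literal_dbt_calls(source: str) -> list[str]:
    """List readable problems for dbt.ref/dbt.source calls with non-literal arguments."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        is_dbt_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "dbt"
            and node.func.attr in DBT_FUNCS
        )
        if not is_dbt_call:
            continue
        for arg in node.args:
            # Only plain string constants can be resolved without running the model.
            if not (isinstance(arg, ast.Constant) and isinstance(arg.value, str)):
                problems.append(
                    f"line {node.lineno}: dbt.{node.func.attr}() expects string "
                    f"literals, got expression `{ast.unparse(arg)}`"
                )
    return problems


for problem in non_literal_dbt_calls(DEBUG_SOURCE):
    print(problem)
```

Reporting the offending expression and line in plain words avoids exposing the raw `<ast.Name object at 0x...>` representation seen in the original traceback.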

github-actions[bot] commented 1 year ago

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions[bot] commented 1 year ago

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

saviorand commented 1 year ago

Interested in this, is there a plan to support dynamic values in the future?

JamesRusconi commented 6 months ago

Also interested in this

hjhdaniel commented 3 months ago

Hi @ChenyuLInx @lostmygithubaccount , I am currently developing on DBT/Databricks and am facing this issue.

Context: My Databricks catalog is designed so that my client companies are cleanly separated by schema (the schema names are the companies' identifiers, and each company has exactly the same table structure). This means I need to call DBT source with a dynamic source_name argument that references my Python vars, so that the DBT run can reference and transform the correct client company's schema in Databricks.

1) I need Python models for complex transformations with recursive calls that are not achievable with Databricks SQL DBT models.
2) I need to get and pass dynamic source_name variables like this:

def model(dbt, session):
    dbt.config(materialized="table", unique_key="id")
    source_name = dbt.config.get("DATABRICKS_SCHEMA_NAME")
    source_table = dbt.source(source_name, "source_table_name")

    ...

    return final_table

Question/Request: I would like to understand a little more about why this feature is not supported or encouraged.

1) What limitation or risk am I unaware of here?
2) Can this safeguard be bypassed via some configuration (e.g. disallow a dynamic source_name argument by default, but provide a flag in the DBT project or profile config that allows it)?
3) Are there any other workarounds I can use to achieve what I need in Python models?

Note: I am suggesting this because it is achievable with SQL models in the following manner, so it feels a little inconsistent in principle to treat Python and SQL models differently (see the working SQL model example below):

{% set source_name = var('DATABRICKS_SCHEMA_NAME') %}

with source_table as (
    select
        *
    from
        {{ source(source_name, 'source_table_name') }}
),

...

select * from final_table
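
No workaround is confirmed by the maintainers in this thread. One possible approach, sketched here under assumptions and not tested against dbt itself, is to write every candidate `dbt.source` call with string literals so that static analysis can see it, and then pick one at runtime. The schema names `company_a` and `company_b` are hypothetical, and `StubDbt` is only a stand-in so the sketch can run outside dbt; it is not part of dbt's API.

```python
def model(dbt, session):
    dbt.config(materialized="table")

    # Each call uses string literals, so a static parser can record it.
    # "company_a" and "company_b" are hypothetical schema names.
    candidates = {
        "company_a": dbt.source("company_a", "source_table_name"),
        "company_b": dbt.source("company_b", "source_table_name"),
    }

    # The dynamic part happens only at runtime, as a plain dictionary lookup.
    schema_name = dbt.config.get("DATABRICKS_SCHEMA_NAME")
    return candidates[schema_name]


class _StubConfig:
    """Stand-in for dbt.config: callable, with a get() method (illustration only)."""

    def __init__(self, values):
        self._values = dict(values)

    def __call__(self, **kwargs):
        self._values.update(kwargs)

    def get(self, key, default=None):
        return self._values.get(key, default)


class StubDbt:
    """Stand-in for the dbt object passed to model() (illustration only)."""

    def __init__(self, config_values):
        self.config = _StubConfig(config_values)

    def source(self, source_name, table_name):
        # A real run would return a DataFrame; a string is enough for the sketch.
        return f"{source_name}.{table_name}"


print(model(StubDbt({"DATABRICKS_SCHEMA_NAME": "company_b"}), session=None))
```

The trade-off is that the model would presumably depend on every listed source, and the list has to be maintained by hand as client schemas are added, so this does not replace the dynamic behavior of the SQL example above.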

JoyMurmur commented 3 months ago

Do we have an update on this? This feature would add a lot of flexibility to many Python data processes. For example, allowing f-string input would already be a big help. Or is there currently any workaround for Python models?

toransahu commented 1 month ago

Interested in this, is there a plan to support dynamic values in the future?