astronomer / astronomer-cosmos

Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
https://astronomer.github.io/astronomer-cosmos/
Apache License 2.0
607 stars 153 forks

[Bug] Render source nodes #1229

Open was-av opened 20 hours ago

was-av commented 20 hours ago

Astronomer Cosmos Version

Other Astronomer Cosmos version (please specify below)

If "Other Astronomer Cosmos version" selected, which one?

1.6.0

dbt-core version

1.3.2

Versions of dbt adapters

dbt-clickhouse==1.3.3 dbt-core==1.3.2 dbt-extractor==0.4.1

LoadMode

AUTOMATIC

ExecutionMode

LOCAL

InvocationMode

DBT_RUNNER

airflow version

2.10.1

Operating System

Debian GNU/Linux 11 (bullseye)

If you think it's a UI issue, what browsers are you seeing the problem on?

No response

Deployment

Official Apache Airflow Helm Chart

Deployment details

No response

What happened?

I added sources to the Airflow DAG by setting source_rendering_behavior to SourceRenderingBehavior.ALL and got the error shown below.

Relevant log output

[2024-09-27T09:53:31.951+0000] {graph.py:136} INFO - Running command: `/opt/airflow/dbt_venv/bin/dbt ls --output json --output-keys name unique_id resource_type depends_on original_file_path tags config freshness --project-dir /tmp/tmp4fja0ww3 --profiles-dir /workspace/transform/clickhouse_dbt --profile clickhouse_dbt --target prod --vars {"logical_date": "{{ ds }}"} --selector daily`
Traceback (most recent call last):
  File "/workspace/dags/dbt_cosmos.py", line 124, in <module>
    globals()[dag_id] = build_dbt_dag(dag_id, config)
  File "/workspace/dags/dbt_cosmos.py", line 106, in build_dbt_dag
    return dbt_cosmos()
  File "/usr/local/lib/python3.10/site-packages/airflow/models/dag.py", line 4307, in factory
    f(**f_kwargs)
  File "/workspace/dags/dbt_cosmos.py", line 85, in dbt_cosmos
    dbt_run_and_test = DbtTaskGroup(
  File "/usr/local/lib/python3.10/site-packages/cosmos/airflow/task_group.py", line 28, in __init__
    DbtToAirflowConverter.__init__(self, *args, **specific_kwargs(**kwargs))
  File "/usr/local/lib/python3.10/site-packages/cosmos/converter.py", line 261, in __init__
    self.dbt_graph.load(method=render_config.load_method, execution_mode=execution_config.execution_mode)
  File "/usr/local/lib/python3.10/site-packages/cosmos/dbt/graph.py", line 402, in load
    self.load_via_dbt_ls()
  File "/usr/local/lib/python3.10/site-packages/cosmos/dbt/graph.py", line 461, in load_via_dbt_ls
    self.load_via_dbt_ls_without_cache()
  File "/usr/local/lib/python3.10/site-packages/cosmos/dbt/graph.py", line 581, in load_via_dbt_ls_without_cache
    nodes = self.run_dbt_ls(dbt_cmd, self.project_path, tmpdir_path, env)
  File "/usr/local/lib/python3.10/site-packages/cosmos/dbt/graph.py", line 442, in run_dbt_ls
    stdout = run_command(ls_command, tmp_dir, env_vars)
  File "/usr/local/lib/python3.10/site-packages/cosmos/dbt/graph.py", line 156, in run_command
    raise CosmosLoadDbtException(f"Unable to run {command} due to the error:\n{details}")
cosmos.dbt.graph.CosmosLoadDbtException: Unable to run ['/opt/airflow/dbt_venv/bin/dbt', 'ls', '--output', 'json', '--output-keys', 'name', 'unique_id', 'resource_type', 'depends_on', 'original_file_path', 'tags', 'config', 'freshness', '--project-dir', '/tmp/tmp4fja0ww3', '--profiles-dir', '/workspace/transform/clickhouse_dbt', '--profile', 'clickhouse_dbt', '--target', 'prod', '--vars', '{"logical_date": "{{ ds }}"}', '--selector', 'daily'] due to the error:
usage: dbt [-h] [--version] [-r RECORD_TIMING_INFO] [-d]
           [--log-format {text,json,default}] [--no-write-json]
           [--use-colors | --no-use-colors] [--printer-width PRINTER_WIDTH]
           [--warn-error] [--no-version-check]
           [--partial-parse | --no-partial-parse] [--use-experimental-parser]
           [--no-static-parser] [--profiles-dir PROFILES_DIR]
           [--no-anonymous-usage-stats] [-x]
           [--event-buffer-size EVENT_BUFFER_SIZE] [-q] [--no-print]
           [--cache-selected-only | --no-cache-selected-only]
           {docs,source,init,clean,debug,deps,list,ls,build,snapshot,run,compile,parse,test,seed,run-operation}
           ...
dbt: error: unrecognized arguments: unique_id resource_type depends_on original_file_path tags config freshness

How to reproduce

  1. Create a dbt project with sources.
  2. Create an Airflow DAG with a DbtTaskGroup.
  3. Pass a RenderConfig with source_rendering_behavior=SourceRenderingBehavior.ALL.
  4. Run the Python file that defines the DAG.

Anything else :)?

The error is related to this code in cosmos/dbt/graph.py:

    def run_dbt_ls(
        self, dbt_cmd: str, project_path: Path, tmp_dir: Path, env_vars: dict[str, str]
    ) -> dict[str, DbtNode]:
        """Runs dbt ls command and returns the parsed nodes."""
        if self.render_config.source_rendering_behavior != SourceRenderingBehavior.NONE:
            ls_command = [
                dbt_cmd,
                "ls",
                "--output",
                "json",
                "--output-keys",
                "name",
                "unique_id",
                "resource_type",
                "depends_on",
                "original_file_path",
                "tags",
                "config",
                "freshness",
            ]
        else:
            ls_command = [dbt_cmd, "ls", "--output", "json"]

The values for --output-keys must be passed as a single string.
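
The parser behaviour behind the error can be reproduced outside dbt with a minimal argparse sketch. It assumes, based only on the usage message and error in the log above, that the dbt-core 1.3.x `ls` subcommand declares `--output-keys` as an option taking a single value. The parser below is a hypothetical stand-in, not dbt's actual parser.

```python
import argparse

# Hypothetical stand-in for the dbt 1.3.x CLI: an "ls" subcommand whose
# --output-keys option takes exactly one value (an assumption inferred from the log).
parser = argparse.ArgumentParser(prog="dbt-demo")
subparsers = parser.add_subparsers(dest="command")
ls_parser = subparsers.add_parser("ls")
ls_parser.add_argument("--output", default="selector")
ls_parser.add_argument("--output-keys")

# Keys passed as separate argv tokens: only the first one is bound to the
# option, the rest are left over as unrecognized arguments.
separate = ["ls", "--output", "json", "--output-keys", "name", "unique_id", "resource_type"]
args_separate, leftover_separate = parser.parse_known_args(separate)

# Keys passed as one argv token: the whole string is bound to the option.
joined = ["ls", "--output", "json", "--output-keys", "name unique_id resource_type"]
args_joined, leftover_joined = parser.parse_known_args(joined)

print(args_separate.output_keys, leftover_separate)
print(args_joined.output_keys, leftover_joined)
```

This only illustrates why the parser rejects the command. Whether dbt 1.3.x would then interpret a space-separated string as several keys is not established here.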

Are you willing to submit PR?

Contact Details

wasrull@gmal.com

dosubot[bot] commented 20 hours ago

Hey @was-av! 👋 I'm here to help you with bug fixes, answer your questions, and guide you on becoming a contributor.

The error you're encountering is due to the --output-keys argument in the dbt ls command being passed as separate strings instead of a single string. This causes dbt to misinterpret the arguments.

To fix this, you need to modify the run_dbt_ls function in cosmos/dbt/graph.py to concatenate the --output-keys values into a single string. Here is the corrected code snippet:

def run_dbt_ls(
    self, dbt_cmd: str, project_path: Path, tmp_dir: Path, env_vars: dict[str, str]
) -> dict[str, DbtNode]:
    """Runs dbt ls command and returns the parsed nodes."""
    if self.render_config.source_rendering_behavior != SourceRenderingBehavior.NONE:
        ls_command = [
            dbt_cmd,
            "ls",
            "--output",
            "json",
            "--output-keys",
            "name unique_id resource_type depends_on original_file_path tags config freshness",
        ]
    else:
        ls_command = [dbt_cmd, "ls", "--output", "json"]

This change ensures that the --output-keys argument is correctly formatted as a single string, which should resolve the error you're seeing [1].


pankajastro commented 20 hours ago

Hey @was-av, the source node rendering feature is available only for dbt-core >= 1.5: https://astronomer.github.io/astronomer-cosmos/configuration/source-nodes-rendering.html
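
For projects that must support several dbt-core versions, one possible way to act on this constraint is a small version guard. The sketch below is an illustration only: `supports_source_rendering` is a hypothetical helper, not part of Cosmos, and it assumes a plain `major.minor.patch` version string such as the `1.3.2` reported in this issue.

```python
def supports_source_rendering(dbt_core_version: str) -> bool:
    """Return True if the version is at least 1.5, the minimum stated above.

    Hypothetical helper: compares only the major and minor components.
    """
    major, minor = (int(part) for part in dbt_core_version.split(".")[:2])
    return (major, minor) >= (1, 5)


# Compare as integer tuples so that 1.10 sorts after 1.5.
for version in ("1.3.2", "1.5.0", "1.10.1"):
    print(version, supports_source_rendering(version))
```

A DAG file could use such a check to pass SourceRenderingBehavior.ALL only when the installed dbt-core is new enough, and fall back to SourceRenderingBehavior.NONE otherwise.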

was-av commented 19 hours ago

@pankajastro, thanks for your quick answer. Could you explain why the source node rendering feature requires dbt-core 1.5 as a minimum version?