apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0
37.03k stars 14.28k forks source link

setting parameters in SQLExecuteQueryOperator fails (should be the same as using params) #42344

Closed schweizersta closed 1 month ago

schweizersta commented 1 month ago

Apache Airflow Provider(s)

standard

Versions of Apache Airflow Providers

apache-airflow-providers-postgres==5.12.0

Apache Airflow version

2.10.1

Operating System

RHEL 9.4

Deployment

Virtualenv installation

Deployment details

Just default airflow with apache-airflow-providers-postgres

What happened

When using "parameters" to provide params to SQLExecuteQueryOperator, they are not available in the template. An exception is thrown: "jinja2.exceptions.UndefinedError: 'dict object' has no attribute '....'"

Only when given as "params" it works.

What you think should happen instead

Both "parameters" and "params" should be usable, but currently only "params" works.

From the docu:
SQLExecuteQueryOperator provides parameters attribute which makes it possible to dynamically inject values into your SQL requests during runtime. The BaseOperator class has the params attribute which is available to the SQLExecuteQueryOperator by virtue of inheritance. Both parameters and params make it possible to dynamically pass in parameters in many interesting ways.

How to reproduce

Define a task using SQLExecuteQueryOperator and provide params using "parameters" like here:

    sql_task = SQLExecuteQueryOperator(
        task_id="task_id",
        conn_id="conn_id",
        sql="file.sql",
        parameters={'param_01':param_01_value},
    )

And the file.sql may look like that:

select * from table where column = '{{ params.param_01 }}'

Once running the DAG an error is thrown. See stacktrace below:

Traceback (most recent call last):
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/models/abstractoperator.py", line 770, in _do_render_template_fields
    rendered_content = self.render_template(
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/template/templater.py", line 171, in render_template
    return self._render(template, context)
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/models/abstractoperator.py", line 725, in _render
    return super()._render(template, context, dag=dag)
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/template/templater.py", line 127, in _render
    return render_template_to_string(template, context)
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/utils/helpers.py", line 301, in render_template_to_string
    return render_template(template, cast(MutableMapping[str, Any], context), native=False)
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/airflow/utils/helpers.py", line 296, in render_template
    return "".join(nodes)
  File "<template>", line 13, in root
  File "/home/user/apps/conf/venv_airflow/lib64/python3.9/site-packages/jinja2/runtime.py", line 857, in _fail_with_undefined_error
    raise self._undefined_exception(self._undefined_message)
jinja2.exceptions.UndefinedError: 'dict object' has no attribute 'param_01'

Anything else

The problem occurs every time

Are you willing to submit PR?

Code of Conduct

boring-cyborg[bot] commented 1 month ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

jscheffl commented 1 month ago

The {{ params.*** }} you are referring to are the DAG run parameters which can be defined on DAG level as described in https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/params.html#referencing-params-in-a-task

The parameters you are trying to add from the operators need to be passed in form %(parameter_name)s into the query string - as documented in https://airflow.apache.org/docs/apache-airflow-providers-postgres/stable/operators/postgres_operator_howto_guide.html#passing-server-configuration-parameters-into-postgresoperator

Your SQL file need to look like:

select * from table where %(param_01)s
schweizersta commented 1 month ago

I am not talking about DAG run parameters, but task parameters (given when instantiating SQLExecuteQueryOperator). With some more investigations I figured out how it works and what I am trying to say:

In the previous section "Passing Parameters into SQLExecuteQueryOperator for Postgres" of the second docu link you referred to, there are 2 examples:

1.

  ...
    sql="SELECT * FROM pet WHERE birth_date BETWEEN SYMMETRIC %(begin_date)s AND %(end_date)s",
    parameters={"begin_date": "2020-01-01", "end_date": "2020-12-31"},
  ...

2.

  ...
    sql="sql/birth_date.sql",
    params={"begin_date": "2020-01-01", "end_date": "2020-12-31"},
  ...

In the second example (where the sql command is stored in a file), one has to provide values for the sql parameters using params=. But it does not matter how you access the paramters in the sql command, whether by using %(begin_date) or {{ params.begin_date }}, both of them work fine.

What got me confused, is the statement at the beginning of this section, where it says: "Both parameters and params make it possible to dynamically pass in parameters in many interesting ways.". So I thougt, I can use either params or parameters in both cases. But that "parameter" is only valid for the first case and "params" only for the second one, was not obvious to me.

Now that I know how it's meant to be, I leave it up to you, if you want to clarify this more in the docu or not.

jscheffl commented 1 month ago

Thanks for the feedback! As the community is always happy to receive not only feedback but also direct corrections... would it be possible to open a small PR and propose a change? If you just do this, then I can review and merge it. Just use the button on the bottom of each page:

image

It is really simple and once you have made your first PR you are joining the crowd and be proud! Might take <5min.

kunaljubce commented 1 month ago

@jscheffl I have raised the PR to make this documentation change. Please review.