apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Jinja NativeEnvironment causing unexpected argument type change #34641

Open CptTZ opened 1 year ago

CptTZ commented 1 year ago

Apache Airflow version

2.7.1

What happened

I have a PythonOperator that takes a dictionary in op_args. When I enable render_template_as_native_obj, any string value in the dict that contains only numeric characters (e.g. "life": "42") is converted to int.

I did some more research, and the root cause seems to be related to py being used in template_fields_renderers. I haven't looked further into the Airflow implementation, but I can help with debugging.

What you think should happen instead

No field type conversion should ever happen.

How to reproduce

DAG file:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.operators.smooth import SmoothOperator

BUGGY_DICT = {
    "t1": 123,
    "t4": "GOOD STR",
    "life": "42",
}

def a_py_fun(a):
    print(f"XX: {a}")
    print(f"XX type: {type(a['life'])}")

with DAG(
    dag_id='test',
    schedule=None,
    start_date=datetime(2023, 1, 1),
    catchup=False,
    render_template_as_native_obj=True,  # Or False
) as dag:
    d = PythonOperator(
        task_id="py_bug",
        python_callable=a_py_fun,
        op_args=[BUGGY_DICT],
    )

    d >> SmoothOperator(task_id="xxx")

if __name__ == "__main__":
    dag.test()

When render_template_as_native_obj is True:

...
XX: {'t1': 123, 't4': 'GOOD STR', 'life': 42}
XX type: <class 'int'>
...

When render_template_as_native_obj is False:

...
XX: {'t1': 123, 't4': 'GOOD STR', 'life': '42'}
XX type: <class 'str'>
...

Operating System

Likely OS agnostic, reproducible on Ubuntu and macOS

Versions of Apache Airflow Providers

N/A

Deployment

Virtualenv installation

Deployment details

pip install apache-airflow==2.7.1

Anything else

No response

boring-cyborg[bot] commented 1 year ago

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

Taragolis commented 1 year ago

Unfortunately, that is how NativeEnvironment works:

import sys
from jinja2.nativetypes import NativeEnvironment

environment = NativeEnvironment()

jinja_context = {
    "var_int": 42,
    "var_numeric_str": "42",
    "var_string": "foo-bar",
    "var_module": sys
}

jinja_templates = {
    "no-templates": "42",
    "no-templates-whitespace": "42 ",
    "template-var-int": "{{ var_int }}",
    "template-var-numeric-str": "{{ var_numeric_str }}",
    "template-var-string": "{{ var_string }}",
    "template-var-module": "{{ var_module }}",
}

for template_names, string_template in jinja_templates.items():
    print(f" {template_names}: {string_template!r} ".center(100, "="))
    template = environment.from_string(string_template)
    result = template.render(**jinja_context)
    print(f"Result: {result!r}, Type: {type(result)}")

======================================== no-templates: '42' ========================================
Result: 42, Type: <class 'int'>
================================== no-templates-whitespace: '42 ' ==================================
Result: 42, Type: <class 'int'>
================================ template-var-int: '{{ var_int }}' =================================
Result: 42, Type: <class 'int'>
======================== template-var-numeric-str: '{{ var_numeric_str }}' =========================
Result: 42, Type: <class 'int'>
============================= template-var-string: '{{ var_string }}' ==============================
Result: 'foo-bar', Type: <class 'str'>
============================= template-var-module: '{{ var_module }}' ==============================
Result: <module 'sys' (built-in)>, Type: <class 'module'>
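
For context, the conversion seems to come from the last step of native rendering. As far as I know, Jinja's native mode takes the rendered text and tries to parse it as a Python literal, falling back to the raw string when parsing fails. Below is a minimal stdlib-only sketch of that step. The name native_coerce is made up for illustration, and this is not Jinja's actual code:

```python
import ast


def native_coerce(rendered: str):
    """Hypothetical stand-in for the final step of native rendering."""
    try:
        # Text that parses as a Python literal becomes that object.
        return ast.literal_eval(rendered)
    except (ValueError, SyntaxError):
        # Anything else is returned unchanged as a plain string.
        return rendered


for text in ["42", "'42'", "foo-bar", "GOOD STR"]:
    value = native_coerce(text)
    print(f"{text!r} -> {value!r} ({type(value).__name__})")
```

Under that assumption, "42" is indistinguishable from the integer literal 42 once it has been rendered to text, which would explain why only digit-only strings change type.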

psyking841 commented 1 year ago

Just wrap the number in escaped double quotes (\"); that should solve your problem. For example: "life": "\"42\""

CptTZ commented 1 year ago

Thanks @psyking841, I tried single quotes and that also works:

import sys
from jinja2.nativetypes import NativeEnvironment

environment = NativeEnvironment()

jinja_context = {
    "var_int": 42,
    "var_explicit_numeric_str": "'42'",
    "var_numeric_str": "42",
    "var_string": "foo-bar",
    "var_module": sys
}

jinja_templates = {
    "no-templates": "42",
    "no-templates-whitespace": "42 ",
    "template-var-int": "{{ var_int }}",
    "template-var-explicit-numeric-str": "{{ var_explicit_numeric_str }}",
    "template-var-numeric-str": "{{ var_numeric_str }}",
    "template-var-string": "{{ var_string }}",
    "template-var-module": "{{ var_module }}",
}

for template_names, string_template in jinja_templates.items():
    print(f" {template_names}: {string_template!r} ".center(100, "="))
    template = environment.from_string(string_template)
    result = template.render(**jinja_context)
    print(f"Result: {result!r}, Type: {type(result)}")

...

======================================== no-templates: '42' ========================================
Result: 42, Type: <class 'int'>
================================== no-templates-whitespace: '42 ' ==================================
Result: 42, Type: <class 'int'>
================================ template-var-int: '{{ var_int }}' =================================
Result: 42, Type: <class 'int'>
=============== template-var-explicit-numeric-str: '{{ var_explicit_numeric_str }}' ================
Result: '42', Type: <class 'str'>
======================== template-var-numeric-str: '{{ var_numeric_str }}' =========================
Result: 42, Type: <class 'int'>
============================= template-var-string: '{{ var_string }}' ==============================
Result: 'foo-bar', Type: <class 'str'>
============================= template-var-module: '{{ var_module }}' ==============================
Result: <module 'sys' (built-in)>, Type: <class 'module'>
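
If quoting each value by hand gets tedious, the same trick could be applied to a whole dict before passing it as op_args. This is only a sketch: quote_strings is a hypothetical helper, not part of Airflow or Jinja, and it assumes none of the strings are meant to be Jinja templates. The final step uses ast.literal_eval to imitate what native rendering appears to do with each quoted string:

```python
import ast


def quote_strings(value):
    """Hypothetical helper: repr() every string so it parses back to a str."""
    if isinstance(value, str):
        return repr(value)  # "42" becomes "'42'"
    if isinstance(value, dict):
        return {key: quote_strings(item) for key, item in value.items()}
    if isinstance(value, list):
        return [quote_strings(item) for item in value]
    if isinstance(value, tuple):
        return tuple(quote_strings(item) for item in value)
    return value  # ints, None, etc. are left alone


quoted = quote_strings({"t1": 123, "t4": "GOOD STR", "life": "42"})
print(quoted)

# Parsing each quoted string as a Python literal gives the original string back.
restored = {
    key: ast.literal_eval(item) if isinstance(item, str) else item
    for key, item in quoted.items()
}
print(restored)
```

Note that this only makes sense when render_template_as_native_obj is True. With the default environment the extra quotes would stay in the value.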

@Taragolis maybe I can help documenting this behavior?

jscheffl commented 1 year ago

Oh, yeah. You can easily help, and you don't even need a complex dev environment. Go to the page https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/params.html#referencing-params-in-a-task (where the feature is described) and click the button in the lower right. Edit the page and submit your first PR. It would be great to have this!

Taragolis commented 1 year ago

@CptTZ It would be awesome!

ntnhaatj commented 4 months ago

Why does the Jinja renderer apply coercion to literal input provided by the user? IMO, it should not.
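
One possible reading of the outputs earlier in this thread: every string inside a templated field is treated as template source, whether or not it contains {{ }} markers, so a plain "42" renders to the text 42 and is then parsed as a literal. The sketch below imitates that with the standard library only. render_field is a hypothetical simplification, not Airflow's actual rendering code:

```python
import ast


def render_field(value):
    """Hypothetical, simplified stand-in for native rendering of a templated field."""
    if isinstance(value, str):
        # A string without template markers renders to itself,
        # then the rendered text is parsed as a Python literal if possible.
        try:
            return ast.literal_eval(value)
        except (ValueError, SyntaxError):
            return value
    if isinstance(value, dict):
        return {key: render_field(item) for key, item in value.items()}
    if isinstance(value, list):
        return [render_field(item) for item in value]
    return value


rendered = render_field([{"t1": 123, "t4": "GOOD STR", "life": "42"}])
print(rendered)
```

Under that assumption the renderer has no way to tell a user-supplied literal from template output, which matches the {'t1': 123, 't4': 'GOOD STR', 'life': 42} result reported in the issue.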