ddluke commented 11 months ago

Note that this is my first issue in this project. I hope classifying it as a bug is correct; please correct me if I'm mistaken.

Description

There appears to be no way to prevent Jinja from changing the types of values taken from the templating context when rendering.

This is painful whenever Jinja is used for something other than rendering an HTML template or the like. A popular and widely used example is apache-airflow, where Jinja templating is a built-in feature of operators.

In this case, the operator can mark individual fields of the class constructor as templatable, and Jinja references inside those operator instance fields are replaced with values from the context at execution time.

The issue is that there appears to be no way to tell Jinja to preserve the types of context values when resolving Jinja references.

At least the following two approaches do not work (are there any other approaches I'm not aware of?):

The default Jinja environment apparently converts every resolved Jinja reference to a string, which corrupts the data if the reference points to anything other than a string (an integer, a list, a dict, a datetime object, etc.).
The Jinja native environment does not help in this scenario either, because it also corrupts data (the ast.literal_eval approach is not safe to use, as the example below shows).
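
The float conversion can be reproduced without Jinja at all. The following is a minimal sketch, assuming (as described above) that the native environment's conversion amounts to passing the rendered text through ast.literal_eval. The helper name coerce_rendered is hypothetical and not part of Jinja.

```python
import ast
from typing import Any


def coerce_rendered(rendered: str) -> Any:
    """Hypothetical helper: interpret rendered text as a Python literal if possible."""
    try:
        return ast.literal_eval(rendered)
    except (ValueError, SyntaxError):
        # not a valid Python literal, so keep the text as it is
        return rendered


print(repr(coerce_rendered("3.10")))      # 3.1
print(repr(coerce_rendered("subnet-1")))  # 'subnet-1'
```

The string "3.10" is a valid float literal, so the trailing zero is lost, which is what happens to python_version in the replication code.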

Code Replication

The following code snippet and its two test functions demonstrate the problem caused by this bug (if it is one).

from typing import Any

import jinja2
import jinja2.nativetypes

# this is the object we want to render (note that we do not want to render to a string here!)
training_config = {
    "vpc_subnet_ids": "{{ subnet_ids }}",
    "python_version": "{{ python_version }}"
}

# this is the templating context
templating_context = {
    "subnet_ids": ["subnet-1", "subnet-2"],
    "python_version": "3.10"
}

# this is what we want to get
desired_object = {
    "vpc_subnet_ids": ["subnet-1", "subnet-2"],
    "python_version": "3.10"
}

def render(data: dict[str, str], context: dict[str, Any], env: jinja2.Environment) -> dict[str, Any]:
    """A simplified rendering function; returns a new dict so the input is not mutated."""
    # render each value on its own and collect the results without touching `data`
    return {k: env.from_string(v).render(context) for k, v in data.items()}

def test_with_standard_env():
    result = render(training_config, templating_context, jinja2.Environment())
    assert result == desired_object, result
    # assertion error, because vpc_subnet_ids is now a string ...
    # {'vpc_subnet_ids': "['subnet-1', 'subnet-2']", 'python_version': '3.10'}

def test_with_native_env():
    result = render(training_config, templating_context, jinja2.nativetypes.NativeEnvironment())
    assert result == desired_object, result
    # assertion error, because python version is now 3.1 ...
    # {'vpc_subnet_ids': ['subnet-1', 'subnet-2'], 'python_version': 3.1}

Environment

Python version: 3.10.10
Jinja version: 3.1.2

davidism commented 11 months ago

This sounds like it's up to Airflow to address with how they use Jinja. Jinja is a string templating language, so the fact that rendering produces strings is not a bug. You may also be interested in Jinja's NativeEnvironment instead of the default one.

ddluke commented 11 months ago

Thanks for your reply. I understand that Jinja is a string templating language, but I cannot easily replace the templating engine used by apache-airflow. Jinja's NativeEnvironment doesn't help either (see how python_version above becomes 3.1, where it should really be "3.10").
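
For reference, one approach that stays within Jinja's public API is to bypass string rendering when a field consists of exactly one expression. The following is only a sketch under that assumption: the helper render_preserving_types and the regular expression are hypothetical, not something Jinja or Airflow provides, and whitespace-control markers such as {{- ... -}} are not handled. It relies on Environment.compile_expression, which evaluates an expression and returns the resulting Python object rather than text.

```python
import re
from typing import Any

import jinja2

# Hypothetical pattern: the whole string is one "{{ expression }}" block.
_SINGLE_EXPR = re.compile(r"^\s*\{\{(.*)\}\}\s*$", re.DOTALL)


def render_preserving_types(value: str, context: dict[str, Any], env: jinja2.Environment) -> Any:
    """Return the native object for a lone expression, otherwise a rendered string."""
    match = _SINGLE_EXPR.match(value)
    if match:
        inner = match.group(1)
        # a second "{{" or "}}" inside means several blocks; treat it as a normal template
        if "{{" not in inner and "}}" not in inner:
            expression = env.compile_expression(inner.strip())
            return expression(context)
    # mixed text and expressions: regular string rendering
    return env.from_string(value).render(context)


env = jinja2.Environment()
context = {"subnet_ids": ["subnet-1", "subnet-2"], "python_version": "3.10"}
subnets = render_preserving_types("{{ subnet_ids }}", context, env)
version = render_preserving_types("{{ python_version }}", context, env)
```

With this sketch, subnets stays a list and version stays the string "3.10", while templates that mix text and expressions still render to strings.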

pallets / jinja

Cannot preserve types in templating context #1895
