etsy / boundary-layer

Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform
Apache License 2.0
262 stars 58 forks source link

[generators] Fix bug during empty string conversion w/ generators #126

Closed vchiapaikeo closed 3 years ago

vchiapaikeo commented 3 years ago

Previously, this was true:

assert util.format_value("") == "()"

This isn't what we want when it comes to generator conversion and its usage here becuase what will end up happening is we will create a for loop like this:

for (index, item) in enumerate(language_sensors_iterable_builder(
            items = [{ 'name': 'model_A','index': ('a' if 1 else 'b') },{ 'name': 'model_B','index': () }],
        )):

What we actually want to represent is this:

for (index, item) in enumerate(language_sensors_iterable_builder(                           
            items = [{ 'name': 'model_A','index': ('a' if 1 else 'b') },{ 'name': 'model_B','index': '' }],
        )):                                                                                   

The difference is subtle but it will result in bugs during airflow test dag parsing as a result of a tuple being passed instead of an empty string type:

>>> {'a': (), 'b': ''}
{'a': (), 'b': ''}
>>> data = {'a': (), 'b': ''}
>>> data['a']
()
>>> type(data['a'])
<class 'tuple'>
>>> data['b']
''
>>> type(data['b'])
<class 'str'>
vchiapaikeo commented 3 years ago

Thanksss! I'll merge next week so that we can test against DevDataConf then too :-)