ceumicrodata / mETL

mito ETL tool

complex transformations - embedded python code instead of TARR #16

Open e3krisztian opened 11 years ago

e3krisztian commented 11 years ago

I was looking at the docs, and using TARR in the config looks a bit hairy to me. Is TARR really needed?

I think it would be cleaner if complex transformations were supported only through Python code extensions. There is already a documented way to do that; see below for another proposal.

TARR control structures would then no longer be available in the YAML, but decisions are better handled in much simpler Python code anyway.


If only a few simple transformations are needed, they could be written in Python and embedded in the YAML config file (e.g. under a new python key).

Transformations would be created by executing the value under the python key (if present) as Python code and treating the names defined in the resulting namespace as transformations:

import yaml
from io import StringIO

# YAML document with Python source embedded as a literal block scalar ("|")
yaml_stream = StringIO('''\
python: |
    def f(param):
        print('*** output from embedded f({}) ***'.format(repr(param)))
''')

code = yaml.safe_load(yaml_stream)['python']
print('code from YAML:')
print(code)

# Execute the embedded source; every name it defines ends up in this dict
embedded_module_dict = {}
exec(code, embedded_module_dict)

print('call function defined in YAML:')
embedded_module_dict['f']('hello world from yaml document')

YAML multiline syntax: http://michael.f1337.us/2010/03/30/482836205/
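
The "looking at the resulting namespace for names" step could be sketched roughly as follows. This is only an illustration of the idea, not an existing mETL API: load_transformations and strip_and_upper are hypothetical names, the embedded source is given directly as a string (as if it had already been read from the python key), and treating every public callable as a transformation is an assumption.

```python
def load_transformations(code):
    """Execute embedded source and return its public callables by name.

    Hypothetical helper: not part of mETL.
    """
    namespace = {}
    exec(code, namespace)
    # Assumption: any callable whose name has no leading underscore
    # counts as a transformation; everything else is ignored.
    return {
        name: obj
        for name, obj in namespace.items()
        if callable(obj) and not name.startswith('_')
    }


# Stand-in for the string found under the `python` key of the config
embedded_code = '''\
def strip_and_upper(value):
    return value.strip().upper()

def _helper(value):
    return value
'''

transformations = load_transformations(embedded_code)
print(sorted(transformations))                             # ['strip_and_upper']
print(transformations['strip_and_upper']('  budapest '))   # BUDAPEST
```

Note that exec runs whatever is in the config with full privileges, so this only makes sense for config files that are as trusted as the code itself. Functions that the embedded code imports by name (from x import y) would also be picked up by this filter.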