💨🥫 A Data Factory system for running data processing pipelines built on AirFlow and tailored to CKAN. Includes an evolution of DataPusher and Xloader for loading data to DataStore.
At this time the schema fields are sent as an array in the DAG params (e.g. "schema_fields_array": "['field1', 'field2']") and every field is treated as text type.
What are good alternatives?
Questions:
One option is to turn this array into a dictionary of names and types, but what is a good way to pass it? I believe writing it out in plain text can be tedious. What about adding another node to the DAG that fetches the header of the CSV and automatically creates a dictionary of fields, hard-coding every type to text?
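To make the two ideas above concrete, here is a minimal sketch in plain Python, using only the standard library. It is not the project's implementation: the function names (`parse_fields_param`, `schema_from_names`, `schema_from_csv_header`) are hypothetical, and the assumption is that the header is the first row of the CSV. It parses the current string form of the param and builds a name-to-type dictionary that defaults every field to text.

```python
import ast
import csv
import io


def parse_fields_param(raw):
    """Parse the current string form, e.g. "['field1', 'field2']", into a list."""
    fields = ast.literal_eval(raw)  # safe parsing of a Python literal
    if not isinstance(fields, list) or not all(isinstance(f, str) for f in fields):
        raise ValueError("expected a list of field names")
    return fields


def schema_from_names(names, default_type="text"):
    """Map every field name to the same default type (hypothetical helper)."""
    return {name: default_type for name in names}


def schema_from_csv_header(csv_file, default_type="text"):
    """Read only the first row of an open CSV file and build the dictionary."""
    header = next(csv.reader(csv_file), [])
    return schema_from_names([h.strip() for h in header], default_type)


# Example with an in-memory CSV standing in for the real resource file.
sample = io.StringIO("field1,field2\n1,a\n")
schema = schema_from_csv_header(sample)
```

A function like `schema_from_csv_header` could be the body of the extra DAG node, with its result handed to the loading step. Individual types could then be overridden by merging a smaller, user-supplied dictionary on top of the generated one.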