DataCater / datacater

The developer-friendly ETL platform for transforming data in real-time. Based on Apache Kafka® and Kubernetes®.
https://datacater.io

Support record-level Python transforms #1

Closed flippingbits closed 1 year ago

flippingbits commented 2 years ago

DataCater allows users to filter and transform data using Python functions. Although these functions have access to the entire record, they can change only one column at a time. Being able to change an entire row at once is useful for many use cases, such as flattening JSON objects.

Our current format for transforms is as follows:

def transform(row, value, config):
  return value

We should keep the same signature but allow the function to return an entire row, e.g.:

def transform(row, value, config):
  return row
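
To illustrate the JSON flattening use case mentioned above, here is a minimal sketch of what a record-level transform could look like. It assumes that `row` is passed as a plain Python dict and that a returned dict would replace the whole record; neither is confirmed by this issue. The `flatten` helper and the `_` key separator are hypothetical names chosen for the example, not part of DataCater.

```python
def flatten(obj, prefix="", sep="_"):
    """Recursively flatten nested dicts into a single-level dict.

    Hypothetical helper, not part of DataCater. Nested keys are joined
    with `sep`, e.g. {"a": {"b": 1}} becomes {"a_b": 1}.
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects, carrying the joined key as prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def transform(row, value, config):
    # Assumption: `row` is a dict and the returned dict replaces the record.
    # `value` and `config` are unused in this record-level sketch.
    return flatten(row)


# Local usage example with made-up data.
example = {"id": 1, "user": {"name": "Ada", "address": {"city": "London"}}}
result = transform(example, None, {})
print(result)
```

Keeping the three-argument signature means existing column-level transforms would continue to work unchanged; only the interpretation of the return value would differ.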