jitsucom / jitsu

Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.
https://jitsu.com
MIT License

Support transformation of data coming from pull sources #952

Open vklimontovich opened 2 years ago

vklimontovich commented 2 years ago

Problem

Currently, JavaScript Transformations can only transform data coming from the event stream (API Keys). Sometimes users want to transform data coming from pull sources too.

While transforming data coming from sources is generally a bad idea (data is better processed after it reaches the database, with a tool such as DBT), in some cases it makes sense. Those cases are trivial changes, such as renaming fields and column typing. The latter is something that can't be done efficiently with DBT.
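
To make the "trivial changes" concrete, here is a minimal sketch of a rename plus a type cast on a single source record. The function name `transformSourceRecord` and the fields `created` and `amount` are made up for illustration and are not part of Jitsu.

```javascript
// Hypothetical record shape from a pull source; field names are illustrative only.
function transformSourceRecord(record) {
  const { created, amount, ...rest } = record;
  return {
    ...rest,
    // Rename: created -> created_at, parsed into a Date so it can be typed as a timestamp.
    created_at: new Date(created),
    // Typing: the source delivers amount as a string; cast it to a number.
    amount: Number(amount),
  };
}

const out = transformSourceRecord({
  id: "42",
  created: "2022-01-15T10:00:00Z",
  amount: "19.90",
});
```

The point is that the record already has the right names and types before it reaches the destination, so no casting pass is needed afterwards.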

Solution

All data coming from sources should go through the JS engine. To distinguish data coming from events from data coming from sources, users would need to opt in to transforming source data by setting flags at the top of the transformation:

$.DATA_PULL_TRANSFORM=true
$.DATA_PUSH_TRANSFORM=false // to disable event transform
//transformation
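
One way the engine could act on these flags is sketched below. This is not Jitsu's actual code: `shouldTransform`, `applyTransform` and the `origin` values `"push"` and `"pull"` are hypothetical, and the defaults (event transform on, source transform off) are an assumption inferred from the opt-in wording above.

```javascript
// Hypothetical sketch of flag-based gating; not taken from the Jitsu codebase.
// Assumed defaults: DATA_PUSH_TRANSFORM=true, DATA_PULL_TRANSFORM=false.
function shouldTransform(flags, origin) {
  const push = flags.DATA_PUSH_TRANSFORM ?? true;  // events are transformed unless disabled
  const pull = flags.DATA_PULL_TRANSFORM ?? false; // sources are skipped unless opted in
  return origin === "pull" ? pull : push;
}

// Runs the user's transform only when the flags allow it for this origin;
// otherwise the record passes through unchanged.
function applyTransform(record, origin, flags, transform) {
  return shouldTransform(flags, origin) ? transform(record) : record;
}
```

With `{ DATA_PULL_TRANSFORM: true, DATA_PUSH_TRANSFORM: false }`, a record with origin `"pull"` would be transformed and a `"push"` event would pass through untouched. Existing transformations that set neither flag would keep their current behaviour.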

Certain features can be disabled for source transforms if they make the implementation harder:

sashayakovtseva commented 2 years ago

+1 for this one

absorbb commented 1 year ago

@sashayakovtseva Experimental support design was added in 1.43.3