Open bdstuart opened 2 years ago
SegmentProcessorFramework. We can do something similar to the ingestion config. Currently, the map phase is hardcoded to initialize the default composite transformer. @jtao15 @Jackie-Jiang
For this feature request, another solution is to just not read the values of the skipped columns. The existing transformer can handle filling in the default values. @snleee Do you see other use cases where we would want a custom transform other than the one in the ingestion config?
@snleee are you planning to pick this up?
Here is what I said in the Pinot troubleshooting channel: If this works as I think it might, I could maybe have the best of both worlds. A certain amount of my data is in the realtime table with the event_id for potential auditing; then, as I move it to the offline table, I default event_id to 0 and get good rollup.
To which @Jackie-Jiang responded: It is absolutely reasonable. We don't support it currently, but it is doable. Essentially, we need to add a new task config to skip some columns when running the task in ROLLUP or DEDUP mode. Internally, we will fill these columns with default values so that they won't be considered.
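To make the idea concrete, here is a rough Python sketch of why defaulting a skipped column improves rollup. It is not Pinot code: `fill_skipped`, `rollup`, the `skipped_columns` parameter, and the `clicks` metric are all hypothetical names made up for illustration, and the metric aggregation is assumed to be a plain sum.

```python
# Illustrative only: a toy rollup, not the Pinot SegmentProcessorFramework.

def fill_skipped(row, skipped_columns, defaults):
    """Return a copy of row with every skipped column set to its default."""
    out = dict(row)
    for col in skipped_columns:
        out[col] = defaults[col]
    return out


def rollup(rows, dimensions, metrics, skipped_columns=(), defaults=None):
    """Group rows by their dimension values and sum the metrics.

    Skipped columns are overwritten with defaults first, so they no longer
    split rows into separate groups.
    """
    defaults = defaults or {}
    totals = {}
    for row in rows:
        row = fill_skipped(row, skipped_columns, defaults)
        key = tuple(row[d] for d in dimensions)
        if key not in totals:
            totals[key] = {m: 0 for m in metrics}
        for m in metrics:
            totals[key][m] += row[m]
    return [dict(zip(dimensions, key), **agg) for key, agg in totals.items()]


rows = [
    {"user": "a", "event_id": 101, "clicks": 1},
    {"user": "a", "event_id": 102, "clicks": 2},
    {"user": "b", "event_id": 103, "clicks": 5},
]

# Unique event_id values keep every row in its own group.
no_skip = rollup(rows, ["user", "event_id"], ["clicks"])

# Defaulting event_id to 0 lets the two rows for user "a" merge.
with_skip = rollup(
    rows,
    ["user", "event_id"],
    ["clicks"],
    skipped_columns=["event_id"],
    defaults={"event_id": 0},
)
```

With `event_id` kept, the three input rows stay as three rows; with it skipped and defaulted to 0, they collapse to one row per user, which is the rollup behavior described for the offline table.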