Open twosom opened 6 days ago
.take-issue
Could you please elaborate on what is needed here? I understand that we can't use filter operations because they can have a performance impact, but it seems we would have to change the approach entirely to satisfy this requirement. Could you guide me a little on this? I think I can complete it.
@tejasrok007 Thanks for your comment. But I've already done the work and am testing it.
What needs to happen?
When the TransformTranslator in the Apache Spark Runner evaluates ParDo operations, it applies too many filter operations. The filters exist because a ParDo can have multiple outputs, so a filter is applied for each TupleTag to keep only the elements that belong to that tag.
However, the filter operation is also applied to a ParDo with a single output, which can have a performance impact. Therefore, we should avoid applying the filter operation when evaluating ParDo operations with a single output.
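To illustrate the idea, here is a minimal sketch in plain Java. It does not use the Beam or Spark APIs and is not the actual TransformTranslator code: the class `SingleOutputFilterSketch`, the method `splitByTag`, and the use of string tags in place of TupleTag are all hypothetical stand-ins. It only shows the shape of the proposed change, which is to skip the per-tag filter pass when there is exactly one output.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical model: each element is a (tag, value) pair, standing in for
// the tagged output of a ParDo. Strings stand in for TupleTag instances.
final class SingleOutputFilterSketch {

    // Returns one list of values per output tag.
    static Map<String, List<String>> splitByTag(
            List<Map.Entry<String, String>> tagged, List<String> outputTags) {
        Map<String, List<String>> result = new LinkedHashMap<>();

        if (outputTags.size() == 1) {
            // Single output: every element belongs to the only tag,
            // so only the values are extracted and no filter is applied.
            List<String> values = tagged.stream()
                    .map(Map.Entry::getValue)
                    .collect(Collectors.toList());
            result.put(outputTags.get(0), values);
            return result;
        }

        // Multiple outputs: one filter pass per tag over the same input.
        for (String tag : outputTags) {
            List<String> values = tagged.stream()
                    .filter(e -> e.getKey().equals(tag))
                    .map(Map.Entry::getValue)
                    .collect(Collectors.toList());
            result.put(tag, values);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> multi = List.of(
                Map.entry("main", "a"), Map.entry("side", "b"), Map.entry("main", "c"));
        System.out.println(splitByTag(multi, List.of("main", "side")));

        List<Map.Entry<String, String>> single = List.of(
                Map.entry("main", "a"), Map.entry("main", "b"));
        System.out.println(splitByTag(single, List.of("main")));
    }
}
```

In the real runner the saving would presumably come from not scheduling an extra filter transformation on the RDD for a single-output ParDo; the sketch above only mirrors that branching, under the assumption that a single-output ParDo emits elements for one tag only.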
related mail context
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components