linkedin / transport

A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
BSD 2-Clause "Simplified" License

Transport-Spark: Ensure initialize is called before eval #113

Closed rzhang10 closed 2 years ago

rzhang10 commented 2 years ago

Previously, there was no contract guaranteeing that StdUdfWrapper's initialize method always happens-before the eval method. This led to NPEs when fields that are set in initialize were accessed directly in eval before they had been initialized.

This change uses Scala's lazy val feature to enforce the happens-before contract between initialize and eval. We also add a contract that makes initialize happen-before checkInputDataTypes, as checkInputDataTypes is also part of Spark's Expression contract.
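
For readers unfamiliar with the pattern, here is a minimal sketch of how a lazy val can force initialization before the first use. It is not the code in this PR: LazyInitWrapper, prefix, and initCount are hypothetical names, and the class deliberately does not extend Spark's Expression so that it runs standalone.

```scala
// Hypothetical stand-in for a UDF wrapper; not the actual Transport code.
class LazyInitWrapper {
  // Field that only initialize() sets; null until then.
  private var prefix: String = null

  // Counts how many times initialize() has run (for illustration only).
  var initCount: Int = 0

  private def initialize(): Unit = {
    prefix = "hello, "
    initCount += 1
  }

  // The body of a lazy val runs at most once, on first access.
  // Reading `initialized` therefore guarantees initialize() has completed.
  private lazy val initialized: Boolean = {
    initialize()
    true
  }

  // Both entry points touch the lazy val before using any field.
  def checkInputDataTypes(): Boolean = {
    require(initialized)
    prefix != null
  }

  def eval(input: String): String = {
    require(initialized)
    prefix + input
  }
}
```

Whichever of eval or checkInputDataTypes is called first triggers initialize, and later calls reuse the already-computed value, so no caller needs to remember to call initialize explicitly.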