A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
BSD 2-Clause "Simplified" License
297 stars, 73 forks
Transport-Spark: Ensure initialize is called before eval #113
Previously, no contract guaranteed that `StdUdfWrapper`'s `initialize` method would always happen-before its `eval` method. This could cause NPEs when fields that are set inside `initialize` were accessed directly in `eval` before being initialized.
This change uses Scala's `lazy val` feature to enforce the happens-before contract between `initialize` and `eval`. It also adds a contract that `initialize` happens-before `checkInputDataTypes`, since `checkInputDataTypes` is also part of Spark's `Expression` contract.
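
The idea can be illustrated with a minimal sketch. This is not the actual `StdUdfWrapper` code: the class `UdfWrapperSketch`, the `prefix` field, the `initCalls` counter, and the simplified method signatures are all hypothetical stand-ins (Spark's real `Expression` methods take and return different types). It only shows how routing every entry point through a `lazy val` can force initialization to run once, before any field is read.

```scala
// Hypothetical stand-in for a wrapper such as StdUdfWrapper; all names and
// signatures here are illustrative and simplified, not Spark's real API.
class UdfWrapperSketch {
  // Field populated by initialize(); null until initialize() has run.
  private var prefix: String = null

  // Counts how often initialize() runs, only so the sketch can be checked.
  var initCalls: Int = 0

  private def initialize(): Unit = {
    initCalls += 1
    prefix = "hello, "
  }

  // The initializer of a lazy val runs on first access and at most once.
  // Reading `initialized` therefore guarantees initialize() has completed.
  private lazy val initialized: Boolean = {
    initialize()
    true
  }

  // Simplified stand-in for Expression.checkInputDataTypes.
  def checkInputDataTypes(): Boolean = {
    require(initialized) // forces initialize() before the field is read
    prefix != null
  }

  // Simplified stand-in for Expression.eval.
  def eval(input: String): String = {
    require(initialized) // forces initialize() before the field is read
    prefix + input
  }
}
```

With this shape, callers may invoke `eval` or `checkInputDataTypes` in either order; whichever runs first triggers `initialize`, and later calls reuse the already-initialized state.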