laurent-thiebaud-gisaia closed 4 years ago
Another option is to declare the UDF with asNondeterministic(), but if we then use the resulting columns in filters, those filters will not be optimized in the query plan (see https://stackoverflow.com/questions/58696198/spark-udf-executed-many-times?noredirect=1#comment103690028_58696198)
Otherwise, each time "tmpAddressColumn" is used (i.e. for each of the six address properties), the UDF is executed again, once per use.
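
For illustration, here is a minimal sketch of the asNondeterministic() option in Scala. Only "tmpAddressColumn" and the count of six address properties come from this thread. The `Address` case class, its field names, the `lookupAddress` function and the `lat`/`lon` columns are hypothetical stand-ins. As far as I understand it, marking the UDF as nondeterministic should stop the optimizer from inlining the call into each of the six field extractions, at the cost mentioned above for filters.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical result type: the thread only says there are six address properties.
case class Address(street: String, city: String, postcode: String,
                   county: String, state: String, country: String)

object NondeterministicUdfSketch {

  // Hypothetical stand-in for an expensive lookup (e.g. a geocoding call).
  def lookupAddress(lat: Double, lon: Double): Address =
    Address("street", "city", "postcode", "county", "state", s"country@$lat,$lon")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("nondeterministic-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // asNondeterministic() tells Catalyst it may not assume that repeated
    // calls with the same input return the same value.
    val addressUdf = udf(lookupAddress _).asNondeterministic()

    val df = Seq((48.85, 2.35), (43.60, 1.44)).toDF("lat", "lon")

    // The UDF result is stored once in a temporary struct column...
    val withAddress =
      df.withColumn("tmpAddressColumn", addressUdf(col("lat"), col("lon")))

    // ...and the six properties are then extracted from that column.
    val result = withAddress.select(
      col("lat"),
      col("lon"),
      col("tmpAddressColumn.street").as("street"),
      col("tmpAddressColumn.city").as("city"),
      col("tmpAddressColumn.postcode").as("postcode"),
      col("tmpAddressColumn.county").as("county"),
      col("tmpAddressColumn.state").as("state"),
      col("tmpAddressColumn.country").as("country")
    )

    // Inspect the physical plan to see how many times the UDF appears.
    result.explain()
    result.show(false)

    spark.stop()
  }
}
```

Comparing the output of `explain()` with and without `.asNondeterministic()` is a quick way to check whether the UDF call is duplicated on a given Spark version.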