linkedin / transport

A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
BSD 2-Clause "Simplified" License

Transport-spark: Only add files to sparkContext from driver #115

Closed: rzhang10 closed this 2 years ago

rzhang10 commented 2 years ago

Before this patch, getRequiredFiles accessed the SparkSession even when it was called on a Spark executor, which threw an exception:

java.lang.IllegalStateException: SparkSession should only be created and accessed on the driver

This patch adds a condition check so that getRequiredFiles is only called on the driver. For reference, see the check Spark itself performs: https://github.com/apache/spark/blob/93f646dd00ba8b3370bb904ba91862c407c62cc2/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L1154
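
For readers who want the shape of the guard, here is a minimal sketch of the idea, not the actual diff in this PR. It assumes the usual way to tell driver from executor in Spark, namely that `TaskContext.get()` returns null outside a running task. To keep the sketch runnable without Spark on the classpath, a `ThreadLocal` stands in for that call; `DriverOnlyFiles`, `TASK_CONTEXT`, `isOnDriver`, `registerRequiredFiles`, and `addedFiles` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class DriverOnlyFiles {
    // Hypothetical stand-in for Spark's per-thread task context.
    // Assumption: in real code this would be org.apache.spark.TaskContext.get(),
    // which is null when no task is running on the current thread (i.e. on the driver).
    static final ThreadLocal<Object> TASK_CONTEXT = new ThreadLocal<>();

    // Hypothetical record of files that were registered; in real code this
    // would be a call such as sparkContext.addFile(path) per file.
    static final List<String> addedFiles = new ArrayList<>();

    // True when no task context is set, which we treat as "running on the driver".
    static boolean isOnDriver() {
        return TASK_CONTEXT.get() == null;
    }

    // Registers the files only on the driver; on an executor it does nothing,
    // so the session is never touched from inside a task.
    static void registerRequiredFiles(List<String> requiredFiles) {
        if (!isOnDriver()) {
            return;
        }
        addedFiles.addAll(requiredFiles);
    }

    public static void main(String[] args) {
        // Driver side: no task context, so the file is registered.
        registerRequiredFiles(List.of("model.bin"));

        // Simulated executor side: a task context is present, so the call is a no-op.
        TASK_CONTEXT.set(new Object());
        registerRequiredFiles(List.of("lookup.csv"));
        TASK_CONTEXT.remove();

        System.out.println(addedFiles);
    }
}
```

The point of checking before the call, rather than catching the IllegalStateException afterwards, is that the executor path never reaches the SparkSession at all.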