delta-io / delta-sharing

An open protocol for secure data sharing
https://delta.io/sharing
Apache License 2.0

load_as_spark() returns error "No active SparkSession was found. load_as_spark requires running in a PySpark application." when used in Django Rest Api. #508

Open Shabbir-Khan-12 opened 3 months ago

Shabbir-Khan-12 commented 3 months ago

Hi all, I am trying to load a Delta table using load_as_spark() from the delta-sharing library inside a REST API in a Django app. When the Django service starts, a Spark session is created on startup (by a different app in the same project). When I then call the REST endpoint that reads the Delta table through the Delta Sharing protocol, it fails with: "No active SparkSession was found. load_as_spark requires running in a PySpark application." The error comes from this check in load_as_spark():

    spark = SparkSession.getActiveSession()
    assert spark is not None, (
        "No active SparkSession was found. "
        "load_as_spark requires running in a PySpark application."
    )

SparkSession.getActiveSession() returns a session only if that session is active in the current thread. In my case the session was started in a different thread from the one handling the request, so the check fails and I cannot load data from the Delta table.
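To make the thread behavior described above concrete, here is a minimal toy model using only the Python standard library. It is not PySpark or delta-sharing code: `set_active_session`, `get_active_session`, and `lookup_in_new_thread` are hypothetical stand-ins that only mimic a per-thread "active session" lookup, under the assumption that the real lookup behaves the same way.

```python
import threading

# Toy stand-in for a per-thread "active session" registry.
# This is NOT PySpark code; it only models a thread-local lookup.
_active = threading.local()


def set_active_session(session):
    """Mark `session` as active for the calling thread only."""
    _active.session = session


def get_active_session():
    """Return the calling thread's active session, or None if it has none."""
    return getattr(_active, "session", None)


def lookup_in_new_thread():
    """Run get_active_session() in a fresh thread and return what it saw."""
    result = []
    worker = threading.Thread(target=lambda: result.append(get_active_session()))
    worker.start()
    worker.join()
    return result[0]


# The "startup" thread registers a session, like the app that starts Spark.
set_active_session("session-created-at-startup")

same_thread = get_active_session()     # lookup from the startup thread
other_thread = lookup_in_new_thread()  # lookup from a request-handler-like thread

print(same_thread)   # session-created-at-startup
print(other_thread)  # None
```

In this model the second lookup returns None even though a session exists in the process, which matches the symptom in the REST handler: the assert sees no active session because the lookup runs in a thread that never registered one.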