zero-one-group / geni

A Clojure dataframe library that runs on Spark
Apache License 2.0

Loading geni.core creates a default spark session #332

Open gnarroway opened 3 years ago

gnarroway commented 3 years ago

Info

Info              Value
Operating System  RHEL 7 and Windows
Geni Version      0.0.38
JDK               11.0.10
Spark Version     3.1.2

Problem / Steps to reproduce

If zero-one.geni.core has been required, it creates a default Spark session, which changes the behavior of any later call to g/create-spark-session.

Specifically, geni.core loads geni.spark-context, which loads geni.defaults, which creates a Spark session at load time and stores it in an atom. That atom should probably be a delay, so that the session is only created on first use.

(require '[zero-one.geni.core :as g])

(def s (g/create-spark-session {:app-name "foo"}))
(g/spark-conf s)
;; => {… :spark.app.name "Geni app" …}
;; The app name is wrong: it should be "foo".

If zero-one.geni.spark is required directly instead (as g), the Spark session is configured correctly.

The incorrect behaviour occurs whenever core has been required at any point before the session is created, including transitively, so it is hard to avoid. As noted above, replacing the default session with a delay may be sufficient to fix this.
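
To illustrate the difference between the two approaches, here is a minimal sketch in plain Clojure with no Spark dependency. It assumes that session creation has get-or-create semantics, as Spark's own SparkSession builder does; whether g/create-spark-session goes through that path is my assumption. The names new-registry, get-or-create, eager-scenario and lazy-scenario are hypothetical and are not part of Geni.

```clojure
;; Hypothetical stand-in for Spark's "one active session" behaviour.
;; None of these names exist in Geni; they only model the problem.
(defn new-registry [] (atom nil))

(defn get-or-create
  "Returns the session already held in the registry, or creates one."
  [registry app-name]
  (swap! registry #(or % {:app-name app-name})))

(defn eager-scenario
  "The default session is built immediately, as when it is stored in an atom."
  []
  (let [registry (new-registry)
        _default (atom (get-or-create registry "Geni app"))]
    ;; The user's request is ignored because a session already exists.
    (:app-name (get-or-create registry "foo"))))

(defn lazy-scenario
  "The default session is wrapped in a delay and never dereferenced."
  []
  (let [registry (new-registry)
        default  (delay (get-or-create registry "Geni app"))]
    {:user-app-name     (:app-name (get-or-create registry "foo"))
     :default-realized? (realized? default)}))

(eager-scenario) ;; => "Geni app"
(lazy-scenario)  ;; => {:user-app-name "foo", :default-realized? false}
```

In the lazy case the default is only built if something dereferences it, so code that never touches the default gets the session it asked for.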

Thanks for your work on this library!