This issue was originally reported in the geodocker repository, but I have migrated it here because it appears to be a deeper issue than just configuration.
There is evidently a difference in when and how jars are loaded when a GeoPySpark Python script is spark-submitted versus when it is run in Jupyter (where the code appears to be piped through a spark-submitted pyspark-shell).
A simple test script can succeed when it is run like this

but not like this.

Log output from the latter case displays this:

18/02/05 22:00:37 INFO SparkContext: Added JAR /opt/jars/geotrellis-backend-assembly-0.3.1.jar at spark://172.31.26.186:45578/jars/geotrellis-backend-assembly-0.3.1.jar with timestamp 1517868037194

indicating that the required jar has been loaded, but evidently not at the right time and/or in the right way.
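For context, one plausible explanation consistent with the log above: `SparkContext.addJar` (which produces the "Added JAR" message) ships the jar to executors *after* the driver JVM has already started, so classes needed at driver startup are not found. A minimal, hedged workaround sketch: set `PYSPARK_SUBMIT_ARGS` before pyspark launches the JVM, so the jar is on the classpath from the start. The jar path here is taken from the log line above and would need adjusting for other environments.

```python
import os

# Path taken from the log output above; adjust for your environment.
jar = "/opt/jars/geotrellis-backend-assembly-0.3.1.jar"

# PYSPARK_SUBMIT_ARGS is read by pyspark when it launches the JVM, so
# setting it before the SparkContext is created puts the jar on both the
# executor and driver classpaths at JVM startup, rather than adding it
# afterwards the way SparkContext.addJar does.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    f"--jars {jar} --driver-class-path {jar} pyspark-shell"
)
```

In a Jupyter kernel this would have to run before any pyspark import creates the gateway, which is exactly the kind of ordering difference between spark-submit and the notebook path that this issue describes; it is a sketch of a workaround, not a confirmed fix.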