Closed vpipkt closed 4 years ago
Reproducing in the notebook may center around the Python subprocess.Popen call at pyspark/java_gateway.py:98.
Here's what I extracted from the debugger. Note that env['_PYSPARK_DRIVER_CONN_INFO_PATH'] is created at run time and then, I think, removed by the spark-submit call.
The only place we see a reference to bash is in env['SHELL'] = '/bin/bash'. There is no explicit reference to /usr/bin/env in the arguments here. The actual script / binary spark-submit has the hashbang line #!/usr/bin/env bash, but I'm not sure what about the geopandas import has caused this to happen.
from subprocess import Popen, PIPE
import signal
command = ['/usr/local/spark/./bin/spark-submit', '--conf', 'spark.master=local[*]', '--conf', 'spark.app.name=RasterFrames', '--conf', 'spark.jars=/opt/conda/lib/python3.7/site-packages/pyrasterframes/jars/pyrasterframes-assembly-0.8.5.jar', '--conf', 'spark.serializer=org.apache.spark.serializer.KryoSerializer', '--conf', 'spark.kryo.registrator=org.locationtech.rasterframes.util.RFKryoRegistrator', '--conf', 'spark.kryoserializer.buffer.max=500m', 'pyspark-shell']
env = {'LC_ALL': 'en_US.UTF-8', 'LD_LIBRARY_PATH': ':/opt/conda/lib',
'APACHE_SPARK_REMOTE_PATH': 'spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz',
'LANG': 'en_US.UTF-8', 'HOSTNAME': '37a98020404f', 'NB_UID': '1000',
'CONDA_DIR': '/opt/conda', 'CONDA_VERSION': '4.7.12', 'PWD': '/home/jovyan',
'HOME': '/home/jovyan', 'MINICONDA_MD5': '1c945f2b3335c7b2b15130b1b2dc5cf4',
'DEBIAN_FRONTEND': 'noninteractive', 'SPARK_HOME': '/usr/local/spark',
'NB_USER': 'jovyan', 'HADOOP_VERSION': '2.7',
'APACHE_SPARK_FILENAME': 'spark-2.4.4-bin-hadoop2.7.tgz', 'SHELL': '/bin/bash',
'SPARK_OPTS': '--driver-java-options=-Xms1024M --driver-java-options=-Xmx4096M --driver-java-options=-Dlog4j.logLevel=info',
'APACHE_SPARK_VERSION': '2.4.4', 'SHLVL': '0', 'LANGUAGE': 'en_US.UTF-8',
'PYTHONPATH': '/usr/local/spark/python:/usr/local/spark/python/lib/py4j-0.10.7-src.zip', 'RF_LIB_LOC': '/usr/local/rasterframes',
'APACHE_SPARK_CHECKSUM': '2E3A5C853B9F28C7D4525C0ADCB0D971B73AD47D5CCE138C85335B9F53A6519540D3923CB0B5CEE41E386E49AE8A409A51AB7194BA11A254E037A848D0C4A9E5',
'XDG_CACHE_HOME': '/home/jovyan/.cache/', 'NB_GID': '100', 'PATH': '/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin;/opt/conda/lib',
'MINICONDA_VERSION': '4.7.10', 'KERNEL_LAUNCH_TIMEOUT': '40', 'JPY_PARENT_PID': '7', 'TERM': 'xterm-color', 'CLICOLOR': '1', 'PAGER': 'cat', 'GIT_PAGER': 'cat', 'MPLBACKEND': 'module://ipykernel.pylab.backend_inline',
'_PYSPARK_DRIVER_CONN_INFO_PATH': '/tmp/tmpeomq27my/tmpf81aik85'}
def preexec_func():
    # Ignore SIGINT in the child so Ctrl-C in the notebook does not kill the JVM.
    signal.signal(signal.SIGINT, signal.SIG_IGN)
proc = Popen(command, stdin=PIPE, preexec_fn=preexec_func, env=env)
In a notebook, I did this:
and in the 2nd case, with the 127 return code, I see the message /usr/bin/env: ‘bash’: No such file or directory in stderr.
Is geopandas stomping on env?
Yes... seems to be something about conda ?
Will try to figure out more...
The semicolon is the PATH separator on Windows, not on Linux....
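To make the separator point concrete, here is a minimal sketch that splits the PATH value from the env dump above the way a POSIX lookup would, on ':' only. The helper name posix_path_entries is made up for illustration, and the conclusion assumes bash lives only in /bin in this image, which the thread does not verify.

```python
# PATH value copied from the env dump above (note the ';' before /opt/conda/lib).
path = (
    "/opt/conda/bin:/usr/local/sbin:/usr/local/bin:"
    "/usr/sbin:/usr/bin:/sbin:/bin;/opt/conda/lib"
)


def posix_path_entries(path_value):
    """Split a PATH string on ':' only, as a POSIX lookup does (hypothetical helper)."""
    return path_value.split(":")


entries = posix_path_entries(path)
print(entries[-1])        # /bin;/opt/conda/lib
print("/bin" in entries)  # False
```

If that assumption holds, the last entry is no longer a real directory, so /usr/bin/env would have nowhere left to find bash, which matches the 127 return code.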
I have narrowed it down to this:
$ python -c "import os; print(';' in os.environ['PATH']); import rtree; print(';' in os.environ['PATH']);"
False
True
So basically it has something to do with rtree: either the specific version (0.9.1) or the way it's packaged with conda.
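Until rtree itself is fixed, a stopgap along these lines could be run right after the offending import and before the Spark session is created. This is only a sketch: repair_posix_path is a hypothetical helper, and it assumes no legitimate PATH entry contains a semicolon.

```python
import os


def repair_posix_path(path_value):
    """Replace stray ';' separators with ':' (hypothetical helper).

    Assumes no real directory name on PATH contains ';'.
    """
    return path_value.replace(";", ":")


# Only touch PATH on POSIX systems, where ';' is not a valid separator.
if os.name == "posix" and ";" in os.environ.get("PATH", ""):
    os.environ["PATH"] = repair_posix_path(os.environ["PATH"])
```

For the PATH from the env dump, the tail /bin;/opt/conda/lib becomes /bin:/opt/conda/lib, so /bin is searchable again.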
Here is the issue:
https://github.com/Toblerity/rtree/issues/126
Fixed with this PR: https://github.com/Toblerity/rtree/pull/125
Merged Dec 3, 2019. Fix should be available from versions 0.9.2 onward.
Working on a fix for this that upgrades the container's minimum version of rtree.
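As a rough illustration of the minimum-version idea, the check below treats 0.9.2 as the first fixed release, as stated above. The function names are made up for this sketch, and it assumes plain dotted numeric version strings; how the installed version string is obtained is left out.

```python
def parse_version(text):
    """Turn '0.9.1' into (0, 9, 1); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def rtree_has_path_fix(version_text):
    """True if the given rtree version is 0.9.2 or newer (hypothetical helper)."""
    return parse_version(version_text) >= (0, 9, 2)


print(rtree_has_path_fix("0.9.1"))  # False
print(rtree_has_path_fix("0.9.2"))  # True
```

Comparing tuples rather than raw strings avoids ordering mistakes such as '0.10.0' sorting before '0.9.2'.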
As originally reported by @mjgolebiewski in the s22s/rasterframes-notebook:0.8.5 image (cbc6ce228c8e), the following code results in a runtime error.
Implicitly, import geopandas is defining some version checking utilities that import rtree, which at versions 0.9.0 and 0.9.1 used a literal semicolon instead of os.pathsep.
Error details:
And in the stdout of the notebook I see: /usr/bin/env: ‘bash’: No such file or directory. This is a subtle point and, I think, important. https://gitter.im/locationtech/rasterframes?at=5e2596be5b81ab262e5ae82b