I need to set
os.environ["PYSPARK_PYTHON"] = "/home/ubuntu/anaconda3/bin/python"
in my local Jupyter, where PYSPARK_PYTHON is the Python path on the worker machines.
Otherwise it sends my sys.executable to the cluster, and I get an error from the workers stating that the Python file was not found.
Isn't this the expected behavior? If your cluster has some special configuration, it has to be provided to findspark; otherwise it only looks in common places to set the environment variable.
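For reference, here is a minimal sketch of the workaround described in the question, assuming the workers really do have their interpreter at the path quoted there. The helper name set_worker_python and the WORKER_PYTHON constant are made up for illustration and are not part of findspark or PySpark. The findspark call is left commented out because it needs findspark and a Spark installation to run.

```python
import os

# Assumption: interpreter path on the worker machines, copied from the question.
WORKER_PYTHON = "/home/ubuntu/anaconda3/bin/python"


def set_worker_python(path=WORKER_PYTHON):
    """Point PySpark workers at `path` instead of the local sys.executable."""
    os.environ["PYSPARK_PYTHON"] = path
    return os.environ["PYSPARK_PYTHON"]


# Set the variable before findspark.init() and before any SparkContext is
# created, so the value is already in place when PySpark reads it.
set_worker_python()

# import findspark
# findspark.init()  # requires findspark and a local Spark installation
```

As far as I know, findspark.init() also takes a python_path argument that serves the same purpose, but check the version you have installed before relying on it.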