minrk / findspark

BSD 3-Clause "New" or "Revised" License
Not working with a remote cluster when Jupyter runs locally #31

Open meedeepak opened 4 years ago

meedeepak commented 4 years ago

I need to set os.environ["PYSPARK_PYTHON"] = "/home/ubuntu/anaconda3/bin/python" in my local Jupyter, where PYSPARK_PYTHON is the Python path on the worker machines. Otherwise it passes my local sys.executable to the cluster, and I get an error from the workers saying the Python file was not found.
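For reference, a minimal sketch of the workaround described above. The helper names set_worker_python and init_for_remote_cluster are hypothetical, and the interpreter path is the one from this comment, so it is an assumption about the worker machines and not something findspark detects. The variable has to be set before the SparkContext is created for it to take effect.

```python
import os

# Assumption: interpreter path on the worker machines (taken from the comment above).
WORKER_PYTHON = "/home/ubuntu/anaconda3/bin/python"


def set_worker_python(path: str = WORKER_PYTHON) -> str:
    """Point PySpark workers at `path` instead of the local sys.executable."""
    os.environ["PYSPARK_PYTHON"] = path
    return os.environ["PYSPARK_PYTHON"]


def init_for_remote_cluster(path: str = WORKER_PYTHON) -> None:
    """Hypothetical helper: run findspark, then override the worker interpreter."""
    import findspark  # imported lazily so the helper above works without it

    findspark.init()         # locates Spark and adds pyspark to sys.path
    set_worker_python(path)  # override after init, before creating a SparkContext
```

In a notebook, calling init_for_remote_cluster() in the first cell, before any SparkSession or SparkContext is built, should be enough under these assumptions.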

Hydrugion commented 4 years ago

Isn't this the expected behavior? If your cluster has some special configuration, it has to be provided to findspark; otherwise it only looks in common places when setting the environment variable.