Vilos92 / polynote

Unofficial Docker Image for Polynote https://polynote.org/

PySpark cannot find Python #3

Open jest opened 4 years ago

jest commented 4 years ago

The python3 APK package installs only the /usr/bin/python3 binary, but by default PySpark searches PATH for a binary named python. This results in a kernel error when enabling Spark in a notebook:

2019-11-19T10:55:15.696952916Z /usr/bin/find-spark-home: line 40: python: command not found
2019-11-19T10:55:15.697407078Z /usr/bin/spark-submit: line 27: /bin/spark-class: No such file or directory

I see two solutions to this problem, both applied in the Dockerfile:

  1. Either link python to python3:
    RUN cd /usr/bin && ln -s python3 python
  2. or set the PYSPARK_PYTHON variable (not tested, based on https://stackoverflow.com/questions/30279783/apache-spark-how-to-use-pyspark-with-python-3):
    ENV PYSPARK_PYTHON python3
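
For anyone who cannot rebuild the image, the same idea as solution 2 can be sketched from Python. This is only an illustration, not something tested against this image: the helper name pick_pyspark_python is hypothetical, and it assumes the variable is set before any Spark process is launched from the same environment. Whether a Polynote kernel that is already running picks it up is not verified here.

```python
import os
import shutil


def pick_pyspark_python(env=None, which=shutil.which):
    """Return an interpreter name for PySpark, or None if none is found.

    Hypothetical helper for illustration; not part of PySpark or Polynote.
    """
    env = os.environ if env is None else env
    # An explicit PYSPARK_PYTHON always wins (this is what solution 2 sets).
    explicit = env.get("PYSPARK_PYTHON")
    if explicit:
        return explicit
    # Otherwise prefer "python" and fall back to "python3" if only that exists on PATH.
    for candidate in ("python", "python3"):
        if which(candidate):
            return candidate
    return None


# Set the variable only if it is not already defined in this environment.
# The "python3" fallback is an assumption matching the APK package described above.
os.environ.setdefault("PYSPARK_PYTHON", pick_pyspark_python() or "python3")
```

On an image where only /usr/bin/python3 exists, the helper would return python3, which matches the value used in the ENV line above.
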
jest commented 4 years ago

I also tested solution 2 and can confirm it works.

Vilos92 commented 4 years ago

Hi @jest ,

Thanks for your suggestions on this and the other issue, and sorry for the slow response! Things have been quite busy for me since late November, and I haven't been checking this repo.

Both of your suggestions seem great, and I'll look to incorporate them later this week. Hope you have a Happy New Year! 🎊