Open iholas opened 5 years ago
@iholas the notebooks run properly under the following settings: Python 2.7.15rc1, IPython 5.5.0, Matplotlib 2.2.0, Spark 2.3.2.
Please note that I run these notebooks in a Linux environment. Your Windows/Anaconda environment may need some additional configuration to execute the code properly.
When following the environment instructions for Notebook I Data Processing.ipnb, Matplotlib cannot plot. There seems to be a version conflict between Python, PySpark, ipykernel and Matplotlib. In short, Python 2.7 forces a PySpark installation that downgrades Matplotlib to 1.5.1. This crashes every plot with a failed internal call to a to_rgb method. Matplotlib can be massaged up to 2.2.3, but the problem persists. I tried downgrading ipykernel to 4.9.0 (as suggested somewhere), but that made no difference.
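To compare environments when chasing a conflict like this, it can help to print the versions that the running kernel actually imports. The following is only a minimal sketch: the helper name report_versions is hypothetical, and it assumes each package exposes a __version__ attribute (it falls back to "unknown" otherwise).

```python
from __future__ import print_function  # lets the same cell run on Python 2.7 and 3

import importlib
import sys


def report_versions(names):
    """Map each module name to its __version__, "unknown", or None if not importable."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is not installed in the environment backing this kernel.
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    # Packages named in this issue; adjust the list as needed.
    for name, version in sorted(report_versions(
            ["matplotlib", "ipykernel", "pyspark", "IPython"]).items()):
        print(name, version if version is not None else "not installed")
```

Running this inside the notebook, rather than in a separate shell, shows what the kernel itself sees, which may differ from what the package manager reports.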
I am locally modifying the Notebook for Python 3 (so far just the print statements), with all packages updated, and it seems to be running fine (except for the other issue I submitted, which seems unrelated to this).
Python 3.7 (Anaconda), Spark 2.3.2, PySpark 2.4.0, Matplotlib 3.0.2, ipykernel 5.1.0
I will submit a pull request once I have confirmed this works fully on Python 3.
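For reference, the print-statement change mentioned above could be sketched as follows. The notebook's actual cells are not shown in this thread, so the function name and values here are hypothetical; the only assumption is that the code uses Python 2 style print statements. Importing print_function from __future__ lets the converted cell run unchanged on both Python 2.7 and Python 3.

```python
from __future__ import print_function  # no-op on Python 3, enables print() on 2.7


def format_count(label, count):
    """Build the message separately so the print call stays trivial."""
    return "{}: {}".format(label, count)


# Python 2 only:   print "rows:", 3
# Python 2 and 3:
print(format_count("rows", 3))
```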