boutrosrg / Predictive-Maintenance-In-PySpark


Python 2 seems to no longer play with Matplotlib #2

Open iholas opened 5 years ago

iholas commented 5 years ago

When following the environment instructions for Notebook I Data Processing.ipynb, Matplotlib cannot plot. There seems to be a version conflict among Python, PySpark, ipykernel and Matplotlib.

In short, Python 2.7 forces a PySpark installation that downgrades Matplotlib to 1.5.1. This crashes every plot with a failed call to a to_rgb method somewhere internally.

Matplotlib can be massaged up to 2.2.3, but the problem persists. I tried downgrading ipykernel to 4.9.0 (as suggested somewhere), but that made no difference.

I am locally modifying the notebook for Python 3 (so far just the print statements), with all packages updated, and it seems to be running fine (except for the other issue I submitted, which seems unrelated to this).
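A minimal sketch of the kind of print change involved, assuming the notebook cells use Python 2 print statements. The notebook's actual statements are not shown here, so the helper name `describe` and its arguments are hypothetical:

```python
from __future__ import print_function  # no-op on Python 3, enables print() on Python 2.7


def describe(label, value):
    """Format and print one labelled value; the same code runs on Python 2.7 and 3."""
    # Python 2 only:  print "rows:", value
    # Portable form:  print(...) as a function call
    line = "{}: {}".format(label, value)
    print(line)
    return line
```

With the `__future__` import in the first cell, the same notebook could in principle stay runnable under both interpreters while the port is being confirmed.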

Python 3.7 (Anaconda), Spark 2.3.2, PySpark 2.4.0, Matplotlib 3.0.2, ipykernel 5.1.0
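To compare another environment against the versions listed above, a short sketch like the following reports what is actually importable. The helper name `collect_versions` is hypothetical, and the module list is an assumption based on the packages named in this thread:

```python
from __future__ import print_function

import importlib
import platform


def collect_versions(module_names):
    """Map each module name to its __version__ string, or a placeholder."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module is missing from this environment
            versions[name] = "not installed"
        else:
            # Some modules do not expose __version__
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    print("Python", platform.python_version())
    # Assumed list of packages relevant to this issue
    packages = ["pyspark", "matplotlib", "ipykernel", "IPython"]
    for name, version in sorted(collect_versions(packages).items()):
        print(name, version)
```

Running this in the same kernel the notebook uses matters, since a conda environment and the Jupyter kernel can point at different interpreters.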

I will submit a pull request once I have confirmed this works fully on Python 3.

boutrosrg commented 5 years ago

@iholas the notebooks are running properly under the following settings: Python 2.7.15rc1, IPython 5.5.0, Matplotlib 2.2.0, Spark 2.3.2.

Please note that I run these notebooks in a Linux environment. Your Windows/Anaconda environment may need some additional configuration to execute the code properly.