ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

visualization throws error #597

Closed maschinenzeitmaschine closed 4 weeks ago

maschinenzeitmaschine commented 4 years ago

Describe the bug Although pip install ludwig[viz] was successful, any call to ludwig visualize results in the message: matplotlib or seaborn are not installed. In order to install all visualization dependencies run pip install ludwig[viz]

To Reproduce Steps to reproduce the behavior:

$ pyenv virtualenv 3.7.5 ludwig
$ pyenv global ludwig
$ pip install ludwig
$ pip install ludwig[image]
$ pip install ludwig[viz]
$ ludwig train --data_csv trainingdata.csv --model_definition_file model_definition.yaml
$ ludwig visualize --visualization learning_curves --training_statistics results/experiment_run_1/training_statistics.json

Expected behavior Should render graphs and not return an error.

Environment (please complete the following information):

Additional context If I start Python from the CLI, I can import matplotlib and seaborn with no problem. See also the shell output below:

portabel:ludwig user$ pip install ludwig[viz]
Requirement already satisfied: ludwig[viz] in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (0.2.1)
Requirement already satisfied: absl-py in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (0.8.1)
Requirement already satisfied: numpy>=1.15 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (1.17.4)
Requirement already satisfied: tqdm in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (4.40.2)
Requirement already satisfied: Cython>=0.25 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (0.29.14)
Requirement already satisfied: tensorflow==1.14.0 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (1.14.0)
Requirement already satisfied: PyYAML>=3.12 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (5.2)
Requirement already satisfied: pandas>=0.19 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (0.25.3)
Requirement already satisfied: scikit-learn in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (0.22)
Requirement already satisfied: tabulate>=0.7 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (0.8.6)
Requirement already satisfied: scipy>=0.18 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (1.3.3)
Requirement already satisfied: h5py>=2.6 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (2.10.0)
Requirement already satisfied: matplotlib>=3.0; extra == "viz" in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (3.1.2)
Requirement already satisfied: seaborn>=0.7; extra == "viz" in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from ludwig[viz]) (0.9.0)
Requirement already satisfied: six in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from absl-py->ludwig[viz]) (1.13.0)
Requirement already satisfied: gast>=0.2.0 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (0.3.2)
Requirement already satisfied: google-pasta>=0.1.6 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (0.1.8)
Requirement already satisfied: keras-preprocessing>=1.0.5 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (1.1.0)
Requirement already satisfied: wheel>=0.26 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (0.33.6)
Requirement already satisfied: protobuf>=3.6.1 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (3.11.1)
Requirement already satisfied: tensorboard<1.15.0,>=1.14.0 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (1.14.0)
Requirement already satisfied: wrapt>=1.11.1 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (1.11.2)
Requirement already satisfied: astor>=0.6.0 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (0.8.1)
Requirement already satisfied: grpcio>=1.8.6 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (1.25.0)
Requirement already satisfied: keras-applications>=1.0.6 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (1.0.8)
Requirement already satisfied: termcolor>=1.1.0 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (1.1.0)
Requirement already satisfied: tensorflow-estimator<1.15.0rc0,>=1.14.0rc0 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorflow==1.14.0->ludwig[viz]) (1.14.0)
Requirement already satisfied: python-dateutil>=2.6.1 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from pandas>=0.19->ludwig[viz]) (2.8.1)
Requirement already satisfied: pytz>=2017.2 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from pandas>=0.19->ludwig[viz]) (2019.3)
Requirement already satisfied: joblib>=0.11 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from scikit-learn->ludwig[viz]) (0.14.1)
Requirement already satisfied: cycler>=0.10 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from matplotlib>=3.0; extra == "viz"->ludwig[viz]) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from matplotlib>=3.0; extra == "viz"->ludwig[viz]) (1.1.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from matplotlib>=3.0; extra == "viz"->ludwig[viz]) (2.4.5)
Requirement already satisfied: setuptools in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from protobuf>=3.6.1->tensorflow==1.14.0->ludwig[viz]) (41.2.0)
Requirement already satisfied: werkzeug>=0.11.15 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow==1.14.0->ludwig[viz]) (0.16.0)
Requirement already satisfied: markdown>=2.6.8 in /Users/user/.pyenv/versions/3.7.5/envs/ludwig/lib/python3.7/site-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow==1.14.0->ludwig[viz]) (3.1.1)

portabel:ludwig user$ ludwig visualize --visualization learning_curves --training_statistics results/experiment_run_1/training_statistics.json
 matplotlib or seaborn are not installed. In order to install all visualization dependencies run pip install ludwig[viz]

maschinenzeitmaschine commented 4 years ago

edit: i just tried the same with python 3.6.8 but it produces the same result.

msaisumanth commented 4 years ago

@karlkrach could you please try importing matplotlib & seaborn in a python interpreter and check if it throws the same error?

maschinenzeitmaschine commented 4 years ago

i already did, see "additional context": if i start python from command line, i can run "import seaborn" and "import matplotlib" without problem. or did i misunderstand you? also: thanks for getting back to me!
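One thing that may be worth ruling out at this point is that the ludwig command and the interactive python session run in different environments. The thread does not establish this; it is only an assumption. A minimal sketch that prints the interpreter path and where each module would be loaded from (the helper name describe_environment is hypothetical):

```python
import importlib.util
import sys


def describe_environment(module_names):
    """Return the interpreter path and the file each module would load from.

    A module that cannot be found maps to None.
    """
    locations = {}
    for name in module_names:
        spec = importlib.util.find_spec(name)
        locations[name] = spec.origin if spec is not None else None
    return sys.executable, locations


if __name__ == "__main__":
    exe, locations = describe_environment(["matplotlib", "seaborn"])
    print("interpreter:", exe)
    for name, origin in locations.items():
        print(f"{name}: {origin or 'not found'}")
```

If the interpreter path printed here lies inside the same pyenv environment that pip reported in the shell output above, an environment mismatch is unlikely to be the cause.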

msaisumanth commented 4 years ago

@karlkrach sorry I didn't notice that in your issue description.

So here's where the error is happening.

try:
    import matplotlib as mpl

    if platform == "darwin":  # OS X
        mpl.use('TkAgg')
    import matplotlib.patches as patches
    import matplotlib.path as path
    import matplotlib.patheffects as PathEffects
    import matplotlib.pyplot as plt
    import seaborn as sns
    from matplotlib import ticker
    from matplotlib.lines import Line2D
    from mpl_toolkits.mplot3d import Axes3D
except ImportError:
    logger.error(
        ' matplotlib or seaborn are not installed. '
        'In order to install all visualization dependencies run '
        'pip install ludwig[viz]'
    )
    sys.exit(-1)

Could you please help me narrow down the error by running those import commands?

I think it's because of mpl_toolkits, but I want to make sure. We might need to add another if platform == 'darwin' before the mpl_toolkits import.
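One way to narrow this down is to import each module on its own and print the real exception instead of the generic message. This is a minimal sketch, not ludwig code: the helper name find_failing_imports and the VIZ_MODULES list are hypothetical, with the list simply mirroring the imports in the snippet above.

```python
import importlib
import traceback


def find_failing_imports(module_names):
    """Import each module separately; return {name: traceback text} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:  # report any failure, not only ImportError
            failures[name] = traceback.format_exc()
    return failures


# Modules imported by the try block quoted above.
VIZ_MODULES = [
    "matplotlib",
    "matplotlib.patches",
    "matplotlib.path",
    "matplotlib.patheffects",
    "matplotlib.pyplot",
    "seaborn",
    "matplotlib.ticker",
    "matplotlib.lines",
    "mpl_toolkits.mplot3d",
]

if __name__ == "__main__":
    for name, tb in find_failing_imports(VIZ_MODULES).items():
        print(f"--- {name} failed to import ---\n{tb}")
```

Note that this does not select a backend. To mirror the darwin branch of the original snippet, mpl.use('TkAgg') would have to be called before matplotlib.pyplot is imported.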

Thanks

maschinenzeitmaschine commented 4 years ago

if i paste this in a script 1:1 and run, i get:

portabel:ludwig user$ python matplottest.py
Traceback (most recent call last):
  File "matplottest.py", line 4, in <module>
    if platform == "darwin":  # OS X
NameError: name 'platform' is not defined

if i run the same but with lines 4 & 5 (if…) commented out, it does not throw any error.

msaisumanth commented 4 years ago

oops. you need to do from sys import platform first.

maschinenzeitmaschine commented 4 years ago

i added this line to the top of the script. if i run that now, i get:

portable:ludwig myuser$ python matplottest.py
Traceback (most recent call last):
  File "matplottest.py", line 11, in <module>
    import matplotlib.pyplot as plt
  File "/Users/myuser/.pyenv/versions/ludwig/lib/python3.6/site-packages/matplotlib/pyplot.py", line 2349, in <module>
    switch_backend(rcParams["backend"])
  File "/Users/myuser/.pyenv/versions/ludwig/lib/python3.6/site-packages/matplotlib/pyplot.py", line 221, in switch_backend
    backend_mod = importlib.import_module(backend_name)
  File "/Users/myuser/.pyenv/versions/3.6.8/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/Users/myuser/.pyenv/versions/ludwig/lib/python3.6/site-packages/matplotlib/backends/backend_tkagg.py", line 1, in <module>
    from . import _backend_tk
  File "/Users/myuser/.pyenv/versions/ludwig/lib/python3.6/site-packages/matplotlib/backends/_backend_tk.py", line 6, in <module>
    import tkinter as tk
  File "/Users/myuser/.pyenv/versions/3.6.8/lib/python3.6/tkinter/__init__.py", line 36, in <module>
    import _tkinter # If this fails your Python may not be configured for Tk
ModuleNotFoundError: No module named '_tkinter'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "matplottest.py", line 17, in <module>
    logger.error(
NameError: name 'logger' is not defined

w4nderlust commented 4 years ago

This is weird: you are using macOS, so you should have TkAgg as a backend. That's the whole reason why there's that check if platform == 'darwin'. Could you please tell us the value of the platform variable, and also try replacing TkAgg with agg as suggested here?
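The traceback above ends in No module named '_tkinter', which suggests that this Python build lacks Tk support rather than that matplotlib or seaborn are missing. Because ModuleNotFoundError is a subclass of ImportError, the except ImportError clause catches it and prints the generic message. A minimal sketch for checking Tk support directly (the helper name has_tk_support is hypothetical):

```python
import importlib


def has_tk_support():
    """Return True if this interpreter can import the _tkinter C extension."""
    try:
        importlib.import_module("_tkinter")
    except ImportError:  # also covers ModuleNotFoundError
        return False
    return True


if __name__ == "__main__":
    print("Tk available:", has_tk_support())
```

If this prints False, the TkAgg backend cannot load in that interpreter, whatever the platform variable says.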

seekayel commented 4 years ago

Reproduced this exact issue. My platform variable is darwin

As suggested: mpl.use('agg') ran without issue.

Got same error when I updated examples/titanic/multiple_model_training.py (as referenced in matplotlib#9017):

import matplotlib as mpl
mpl.use('agg')

Other suggestions?

w4nderlust commented 4 years ago

@seekayel Thank you for reporting on this. Could you please tell us your version of macOS and your versions of Python and matplotlib? I'm asking because we introduced that if with TkAgg specifically for Mac users, since it solved the problem they were encountering with matplotlib, while now it looks like it is the source of the problem. So I suspect it may have to do with different versions of macOS shipping with different bundles of Python.

seekayel commented 4 years ago

@w4nderlust

I am using pyenv to install my pythons and pipenv w/ Pipfile to manage dependencies and virtualenv creation.

MacOSX: 10.15.6
Python: 3.7.7 via pyenv
matplotlib: 3.3.0 via pipenv

import matplotlib as mpl
import sys
print(f"sys.platform: {sys.platform}")
print(f"sys.version_info: {sys.version_info}")
print(f"mpl.__version__: {mpl.__version__}")
print(f"mpl.rcsetup.non_interactive_bk: {mpl.rcsetup.non_interactive_bk}")
print(f"mpl.rcsetup.interactive_bk: {mpl.rcsetup.interactive_bk}")

Output:

sys.platform: darwin
sys.version_info: sys.version_info(major=3, minor=7, micro=7, releaselevel='final', serial=0)
mpl.__version__: 3.3.0
mpl.rcsetup.non_interactive_bk: ['agg', 'cairo', 'pdf', 'pgf', 'ps', 'svg', 'template']
mpl.rcsetup.interactive_bk: ['GTK3Agg', 'GTK3Cairo', 'MacOSX', 'nbAgg', 'Qt4Agg', 'Qt4Cairo', 'Qt5Agg', 'Qt5Cairo', 'TkAgg', 'TkCairo', 'WebAgg', 'WX', 'WXAgg', 'WXCairo']

Any other info you need?

w4nderlust commented 4 years ago

I tried to replicate the issue, but was not able to. Could you please try a couple of things to see if they solve your issue? If one of them does, I would at least have narrowed down the cause.

  1. Can you please do a clean install of ludwig in a new virtual environment, using virtualenv instead of pyenv and pipenv? My suspicion is that there are some differences in the way those tools install matplotlib that have the effect of not providing you with TkAgg.

  2. If that doesn't work, can you try replacing the backend with MacOSX? If that works, I can add several attempts to the code: load tkagg, then macosx, and finally agg, and raise the error only if none of them loads.

Thank you for your patience.
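The fallback order described in point 2 could be sketched roughly as follows. This is not the code that went into ludwig. The names pick_backend and switch_matplotlib_backend are hypothetical, and the sketch assumes that matplotlib.pyplot.switch_backend raises ImportError when a backend cannot be loaded.

```python
def pick_backend(candidates, try_backend):
    """Return the first backend name that try_backend accepts.

    try_backend(name) must raise ImportError if the backend cannot be loaded.
    """
    errors = []
    for name in candidates:
        try:
            try_backend(name)
        except ImportError as exc:
            errors.append(f"{name}: {exc}")
        else:
            return name
    raise ImportError("no usable matplotlib backend; tried " + "; ".join(errors))


def switch_matplotlib_backend(name):
    """Ask pyplot to load the backend (assumed to raise ImportError on failure)."""
    import matplotlib.pyplot as plt  # imported lazily; requires matplotlib

    plt.switch_backend(name)


# Usage (requires matplotlib):
#   chosen = pick_backend(["TkAgg", "MacOSX", "agg"], switch_matplotlib_backend)
#   print("using backend:", chosen)
```

Passing the loader in as a function keeps the selection logic independent of matplotlib, so it can be exercised with a stub. Since agg is non-interactive, a run that ends up on it would need to save figures to files rather than show them in a window.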