Closed jhconning closed 8 years ago
I've figured out how to install all the packages I need. Here is what I had to understand and how I got it to work:
pip3 will install some packages but fails on other essential ones such as matplotlib and pandas. The issue is that pip3 (like pip) is a package manager that builds Python packages and their Python dependencies, but it cannot by itself build 'system-wide' dependencies (e.g. non-Python libraries that need to be compiled from C). These packages can, however, be installed directly from binaries, and this is in fact the recommended method.
Since DHBox is running on Ubuntu and the Jupyter notebook server is using a python3 kernel, this is how I installed numpy, matplotlib, pandas and scipy (info from the matplotlib installation page and other sources):
sudo apt-get install python3-numpy
sudo apt-get install python3-matplotlib
sudo apt-get install python3-pandas
sudo apt-get install python3-scipy
Once these libraries are in place, Python libraries that depend on them (e.g. seaborn, plotly) can be installed with pip3.
One last tip: if you want to use matplotlib or any of the plotting libraries mentioned above to display plots in the Jupyter notebook, you will first need to run a code cell with
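To confirm that the apt-get installs are actually visible to the notebook kernel, something like the following could be run in a cell. It is only a sketch: `missing_modules` is a hypothetical helper name, not part of DHBox or Jupyter, and it checks importability only, not versions.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names this interpreter cannot find."""
    # find_spec returns None when a top-level module is not on the import path.
    return [name for name in names if importlib.util.find_spec(name) is None]


# The four libraries installed via apt-get above.
wanted = ["numpy", "matplotlib", "pandas", "scipy"]
print("Not importable from this kernel:", missing_modules(wanted))
```

An empty list means all four can be imported from the kernel that ran the cell.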
%matplotlib inline
This is true in general. In other Jupyter notebook environments I've worked with, forgetting that line just made my plots appear in their own window rather than in the notebook, but in this DHBox environment it gives an error about $DISPLAY not being set.
Thanks to those who sent suggestions via email.
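For plotting code that runs outside a notebook cell, where the `%matplotlib inline` magic is not available, one possible workaround is to fall back to a non-interactive backend when $DISPLAY is unset. This is a minimal sketch that assumes a Linux/X11 setup like the one described here. `pick_backend` and `configure_matplotlib` are hypothetical helper names, and Agg is just one reasonable backend choice, not something DHBox prescribes.

```python
import os


def pick_backend(environ=None):
    """Return "Agg" when no X display is set, otherwise None (keep the default)."""
    env = os.environ if environ is None else environ
    # Assumption: Linux/X11, where an unset or empty DISPLAY means no GUI window can open.
    return None if env.get("DISPLAY") else "Agg"


def configure_matplotlib():
    """Switch matplotlib to a file-only backend if there is no display."""
    import matplotlib  # imported lazily so the helper above works without matplotlib

    backend = pick_backend()
    if backend is not None:
        matplotlib.use(backend)  # Agg renders to image files, never to a window
    return matplotlib.get_backend()
```

With the Agg backend, figures have to be written out with `savefig` rather than shown on screen.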
Thanks for this! Still likely that we'll include those libraries in DH Box sometime soon.
I've been trying out the new jupyter notebook interface (with my colleague @mbaker21231 who is also interested in DHBox) and it fires up great!
We've noticed just one problem, however: even though a `pip freeze` command tells us that several key Python libraries (numpy, nltk, etc.) have been installed, they're not available when one tries to import them from a python3 kernel running in a Jupyter Notebook.
Trying to import numpy from a notebook cell produces the error:
I think I've seen this problem before on local installations. I'm pretty certain that the issue is that you have both python2 and python3 installed on linux. When you have two versions of python on the same machine
pip install nltk
makes the nltk library available to python2 but not to python3. The library would need to be installed with
pip3 install nltk
to make it available to python3.
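One way to sidestep the pip versus pip3 ambiguity is to ask the running kernel which interpreter it is using and invoke pip through that same interpreter. The sketch below only builds and prints the command; `pip_install_command` is a hypothetical helper name, and whether the install succeeds still depends on the system-level dependencies discussed in this thread.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this code."""
    # "python -m pip" runs the pip that belongs to this exact interpreter,
    # so the package lands where this kernel can import it.
    return [sys.executable, "-m", "pip", "install", package]


print("Kernel interpreter:", sys.executable)
print("Install command:", " ".join(pip_install_command("nltk")))
```

The resulting list can be passed to `subprocess.check_call` to run the install from within a notebook cell.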