sagemathinc / cocalc

CoCalc: Collaborative Calculation in the Cloud
https://CoCalc.com

Getting module not found error after installing and verifying module. Can only reproduce within CoCalc. #5838

Closed jt0dd closed 1 month ago

jt0dd commented 2 years ago

This is most likely not a problem with the module we're installing (Elasticsearch's Eland); see https://github.com/elastic/eland/issues/454

We install a module, verify that it exists with pip show, and then get a module not found error in Jupyter. However, we cannot reproduce this on any standalone installation of Jupyter Notebook or Jupyter Lab. We have tested on Kali and Ubuntu.

So we think this is an issue with CoCalc.

Install Jupyter:
└─$ python3 -m pip install jupyter
... snip, successful ...
└─$ python3 -m pip show jupyter
Name: jupyter
Version: 1.0.0
Summary: Jupyter metapackage. Install all the Jupyter components in one go.
Home-page: http://jupyter.org
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.org
License: BSD
Location: /home/jtodd/.local/lib/python3.9/site-packages
Requires: ipykernel, ipywidgets, jupyter-console, nbconvert, notebook, qtconsole
Required-by:

I did have one issue running jupyter notebook related to markupsafe, and ran the following as a fix:

└─$ python3 -m pip install markupsafe==2.0.1

Then of course:

jupyter notebook

Inside Jupyter Notebook, on the default Python 3 kernel:

! python3 -m pip install eland
! python3 -m pip show eland
import eland as ed
print(ed)

output:

Defaulting to user installation because normal site-packages is not writeable
Collecting eland
  Downloading eland-8.1.0-py3-none-any.whl (137 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 137.9/137.9 KB 2.2 MB/s eta 0:00:00a 0:00:01
Collecting pandas<1.4,>=1.2
  Downloading pandas-1.3.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.5/11.5 MB 3.9 MB/s eta 0:00:0000:0100:01
Requirement already satisfied: numpy in /usr/lib/python3/dist-packages (from eland) (1.19.5)
Collecting elasticsearch<9,>=8
  Downloading elasticsearch-8.1.2-py3-none-any.whl (372 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 372.8/372.8 KB 3.9 MB/s eta 0:00:00a 0:00:01
Requirement already satisfied: matplotlib in /usr/lib/python3/dist-packages (from eland) (3.3.4)
Collecting elastic-transport<9,>=8
  Downloading elastic_transport-8.1.1-py3-none-any.whl (58 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.5/58.5 KB 2.0 MB/s eta 0:00:00
Requirement already satisfied: pytz>=2017.3 in /usr/lib/python3/dist-packages (from pandas<1.4,>=1.2->eland) (2021.1)
Requirement already satisfied: python-dateutil>=2.7.3 in /home/jtodd/.local/lib/python3.9/site-packages (from pandas<1.4,>=1.2->eland) (2.8.2)
Requirement already satisfied: urllib3<2,>=1.26.2 in /usr/lib/python3/dist-packages (from elastic-transport<9,>=8->elasticsearch<9,>=8->eland) (1.26.5)
Requirement already satisfied: certifi in /usr/lib/python3/dist-packages (from elastic-transport<9,>=8->elasticsearch<9,>=8->eland) (2020.6.20)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.7.3->pandas<1.4,>=1.2->eland) (1.16.0)
Installing collected packages: elastic-transport, pandas, elasticsearch, eland
Successfully installed eland-8.1.0 elastic-transport-8.1.1 elasticsearch-8.1.2 pandas-1.3.5
Name: eland
Version: 8.1.0
Summary: Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Home-page: https://github.com/elastic/eland
Author: Steve Dodson
Author-email: steve.dodson@elastic.co
License: Apache-2.0
Location: /home/jtodd/.local/lib/python3.9/site-packages
Requires: elasticsearch, matplotlib, numpy, pandas
Required-by: 
<module 'eland' from '/home/jtodd/.local/lib/python3.9/site-packages/eland/__init__.py'>

So at least on Kali running on WSL, this test would suggest the problem is not with Jupyter Notebook.

Next, just to be thorough, we test Jupyter Lab:

└─$ python3 -m pip show jupyterlab

output:

Name: jupyterlab
Version: 3.3.2
Summary: JupyterLab computational environment
Home-page: https://jupyter.org
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.com
License: UNKNOWN
Location: /home/jtodd/.local/lib/python3.9/site-packages
Requires: ipython, jinja2, jupyter-core, jupyter-server, jupyterlab-server, nbclassic, packaging, tornado
Required-by:

In the lab:

! python3 -m pip install eland
! python3 -m pip show eland
import eland as ed
print(ed)

We get the output:

Requirement already satisfied: eland in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (8.1.0)
Requirement already satisfied: matplotlib in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from eland) (3.5.1)
Requirement already satisfied: numpy in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from eland) (1.22.3)
Requirement already satisfied: pandas<1.4,>=1.2 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from eland) (1.3.5)
Requirement already satisfied: elasticsearch<9,>=8 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from eland) (8.1.2)
Requirement already satisfied: elastic-transport<9,>=8 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from elasticsearch<9,>=8->eland) (8.1.1)
Requirement already satisfied: pytz>=2017.3 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from pandas<1.4,>=1.2->eland) (2021.3)
Requirement already satisfied: python-dateutil>=2.7.3 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from pandas<1.4,>=1.2->eland) (2.8.2)
Requirement already satisfied: pyparsing>=2.2.1 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from matplotlib->eland) (3.0.1)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from matplotlib->eland) (4.31.1)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from matplotlib->eland) (1.4.0)
Requirement already satisfied: pillow>=6.2.0 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from matplotlib->eland) (9.0.1)
Requirement already satisfied: cycler>=0.10 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from matplotlib->eland) (0.11.0)
Requirement already satisfied: packaging>=20.0 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from matplotlib->eland) (21.0)
Requirement already satisfied: urllib3<2,>=1.26.2 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from elastic-transport<9,>=8->elasticsearch<9,>=8->eland) (1.26.7)
Requirement already satisfied: certifi in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from elastic-transport<9,>=8->elasticsearch<9,>=8->eland) (2021.10.8)
Requirement already satisfied: six>=1.5 in c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages (from python-dateutil>=2.7.3->pandas<1.4,>=1.2->eland) (1.16.0)
Name: eland
Version: 8.1.0
Summary: Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Home-page: https://github.com/elastic/eland
Author: Steve Dodson
Author-email: steve.dodson@elastic.co
License: Apache-2.0
Location: c:\users\jonat\appdata\local\packages\pythonsoftwarefoundation.python.3.9_qbz5n2kfra8p0\localcache\local-packages\python39\site-packages
Requires: elasticsearch, matplotlib, numpy, pandas
Required-by: 
<module 'eland' from 'C:\\Users\\jonat\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python39\\site-packages\\eland\\__init__.py'>

williamstein commented 2 years ago

Thanks for doing all this testing. I can't reproduce your problem in the most straightforward way on CoCalc, i.e., it works for me.

[Two screenshots attached: Screen Shot 2022-04-06 at 10 11 45 AM, Screen Shot 2022-04-06 at 10 12 27 AM]

jt0dd commented 2 years ago

@williamstein OK, let's discuss it here instead of in the support ticket I opened; the formatting is better.

I was able to reproduce the problem in a new project by running the commands tested above. These are not your commands; I believe you that those work, and we will definitely use them to get past this issue and continue our work:

! pip install eland
! pip show eland
import eland as ed
print(ed)
134.497 seconds
Defaulting to user installation because normal site-packages is not writeable
Collecting eland
  Downloading eland-8.1.0-py3-none-any.whl (137 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 137.9/137.9 KB 7.4 MB/s eta 0:00:00
Requirement already satisfied: numpy in /usr/local/lib/python3.8/dist-packages (from eland) (1.21.5)
Collecting elasticsearch<9,>=8
  Downloading elasticsearch-8.1.2-py3-none-any.whl (372 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 372.8/372.8 KB 33.0 MB/s eta 0:00:00
Requirement already satisfied: matplotlib in /usr/local/lib/python3.8/dist-packages (from eland) (3.5.1)
Collecting pandas<1.4,>=1.2
  Downloading pandas-1.3.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.5/11.5 MB 231.4 MB/s eta 0:00:00a 0:00:01
Collecting elastic-transport<9,>=8
  Downloading elastic_transport-8.1.1-py3-none-any.whl (58 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.5/58.5 KB 208.3 MB/s eta 0:00:00
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.8/dist-packages (from pandas<1.4,>=1.2->eland) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.8/dist-packages (from pandas<1.4,>=1.2->eland) (2021.3)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.8/dist-packages (from matplotlib->eland) (4.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.8/dist-packages (from matplotlib->eland) (21.3)
Requirement already satisfied: pyparsing>=2.2.1 in /usr/local/lib/python3.8/dist-packages (from matplotlib->eland) (2.4.7)
Requirement already satisfied: cycler>=0.10 in /usr/lib/python3/dist-packages (from matplotlib->eland) (0.10.0)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.8/dist-packages (from matplotlib->eland) (9.0.1)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/lib/python3/dist-packages (from matplotlib->eland) (1.0.1)
Requirement already satisfied: certifi in /usr/local/lib/python3.8/dist-packages (from elastic-transport<9,>=8->elasticsearch<9,>=8->eland) (2021.10.8)
Requirement already satisfied: urllib3<2,>=1.26.2 in /usr/local/lib/python3.8/dist-packages (from elastic-transport<9,>=8->elasticsearch<9,>=8->eland) (1.26.7)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.8/dist-packages (from python-dateutil>=2.7.3->pandas<1.4,>=1.2->eland) (1.15.0)
Installing collected packages: elastic-transport, pandas, elasticsearch, eland
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
parameter-sherpa 1.0.6 requires enum34, which is not installed.
pandas-gbq 0.13.2 requires google-cloud-bigquery>=1.11.1, which is not installed.
pysal 2.2.0 requires python-dateutil<=2.8.0, but you have python-dateutil 2.8.1 which is incompatible.
pysal 2.2.0 requires urllib3<1.25, but you have urllib3 1.26.7 which is incompatible.
pandas-profiling 3.1.0 requires joblib~=1.0.1, but you have joblib 1.1.0 which is incompatible.
gs-quant 0.8.368 requires pandas<=1.2.4, but you have pandas 1.3.5 which is incompatible.
gs-quant 0.8.368 requires statsmodels<0.13.0,>=0.11.1, but you have statsmodels 0.13.2 which is incompatible.
arviz 0.11.4 requires typing-extensions<4,>=3.7.4.3, but you have typing-extensions 4.1.1 which is incompatible.
Successfully installed eland-8.1.0 elastic-transport-8.1.1 elasticsearch-8.1.2 pandas-1.3.5
Name: eland
Version: 8.1.0
Summary: Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Home-page: https://github.com/elastic/eland
Author: Steve Dodson
Author-email: steve.dodson@elastic.co
License: Apache-2.0
Location: /home/user/.local/lib/python3.8/site-packages
Requires: elasticsearch, matplotlib, numpy, pandas
Required-by: 
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
/tmp/ipykernel_458/3275718928.py in <cell line: 3>()
      1 get_ipython().system(' pip install eland')
      2 get_ipython().system(' pip show eland')
----> 3 import eland as ed
      4 print(ed)
ModuleNotFoundError: No module named 'eland'

So even though you were able to solve the problem by running pip3 in the console (and I just checked: ! pip3 install in the notebook fails in the same manner as ! pip install), it does seem like a CoCalc-related problem that we can't use the ! pip3 install syntax inside the Notebook, as we can in the examples I reproduced outside of CoCalc. It's rather unintuitive for that not to work, so it might be worth figuring out the underlying problem.
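One way to narrow this down is to check whether the directory that pip show reports as Location is actually on the kernel's import path. The sketch below is only a diagnostic under assumptions: the helper name location_on_sys_path is made up for illustration, and the path is copied from the failing output above. One known way for such a mismatch to arise is that Python adds the per-user site-packages directory to sys.path at startup only if that directory already exists, so a kernel started before the first user install would not see it until restarted. Nothing in this thread confirms that this is what happens on CoCalc.

```python
import os
import sys


def location_on_sys_path(location):
    """Return True if `location` is a directory this interpreter imports from."""
    target = os.path.normcase(os.path.realpath(location))
    return any(
        os.path.normcase(os.path.realpath(entry)) == target
        for entry in sys.path
        if entry  # skip the empty entry that stands for the current directory
    )


# Location line copied from the failing `pip show eland` output above.
pip_location = "/home/user/.local/lib/python3.8/site-packages"

print("kernel interpreter:", sys.executable)
print("pip location on sys.path:", location_on_sys_path(pip_location))
```

If the second line prints False in the notebook, the package is installed somewhere the running kernel does not look, which would match the ModuleNotFoundError.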

Nevertheless, your work-around is definitely easy to do, and that will do the trick for us!

jt0dd commented 2 years ago

One benefit of having Jupyter's ! pip3 install syntax working (rather than doing it directly in the console) is that we can share the Notebook file between projects, machines, and even other researchers in different environments, and that way the Notebook just works without the user needing to run those extra commands.
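For a notebook meant to travel between environments, a possible first cell is sketched below. It is not a fix for the CoCalc behaviour reported here; it only ties the install to the interpreter that runs the kernel by calling sys.executable -m pip instead of whatever pip the shell finds. The helper names pip_install_command and ensure_installed are hypothetical. IPython's built-in %pip magic serves a similar purpose.

```python
import importlib
import importlib.util
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", package]


def ensure_installed(module_name, package=None):
    """Install `package` only if `module_name` is not already importable.

    Returns True if an install was attempted, False if nothing needed doing.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    subprocess.check_call(pip_install_command(package or module_name))
    importlib.invalidate_caches()  # let the import system notice new files
    return True


# Hypothetical first cell of a shared notebook:
# ensure_installed("eland")
# import eland as ed
```

Even with this approach, a kernel restart may still be needed if the install location was not on sys.path when the kernel started.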

williamstein commented 2 years ago

The difference you're seeing is likely related to the shell environment variables. Thanks for the steps to reproduce the problem.
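To compare environments, a small report like the following could be run once in a notebook cell and once with python3 in a terminal of the same project. This is a sketch, not CoCalc's own tooling: the function name python_path_report is an assumption, and the variables listed are simply the ones CPython consults when deciding where to import from.

```python
import os
import site
import sys

# Environment variables that CPython itself reads when building sys.path.
PATH_RELATED_VARS = ("PYTHONPATH", "PYTHONHOME", "PYTHONUSERBASE", "PYTHONNOUSERSITE")


def python_path_report():
    """Collect settings that affect whether user-installed packages are importable."""
    report = {name: os.environ.get(name) for name in PATH_RELATED_VARS}
    report["sys.executable"] = sys.executable
    report["user_site"] = site.getusersitepackages()
    report["user_site_enabled"] = site.ENABLE_USER_SITE
    return report


for key, value in python_path_report().items():
    print(f"{key}: {value}")
```

Any line that differs between the two runs is a candidate explanation for why the terminal install works while the notebook import does not.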

> and that way the Notebook just works without the user needing to run those extra commands.

For clarification -- they do have to run those commands, but they are running them in the notebook rather than a terminal?

jt0dd commented 2 years ago

> For clarification -- they do have to run those commands, but they are running them in the notebook rather than a terminal?

Yes, right. It's the difference between sending the notebook to an inexperienced member on the team and it just working and me getting some question. Jupyter Notebooks are really good for working with people who don't know much Python, thanks to the plaintext / formatted blocks and embedded HTML widgets. Jupyter Lab also lets you collapse a lot of the code blocks. That way, analysts (in our case threat hunting host analysts) who understand scripting basics (think PowerShell) but don't necessarily know any Python only need to modify a few values. You send them the notebook in an environment running Jupyter, and it just works. That's valuable.

It's not a big deal, we can work around the problem easily. I was just highlighting a reason it might be good to have the feature working properly.

williamstein commented 2 years ago

> It's the difference between sending the notebook to an inexperienced member on the team and it just working and me getting some question.

For CoCalc at least, we will very likely install eland so it is available across all CoCalc projects; that may help until this issue gets fixed.

CoCalc's Jupyter also supports hiding any code blocks or setting them to read-only (see the Edit menu with cells selected).