mikeizbicki / cmc-csci143

big data course materials

Import Issues in AWS Sagemaker - Unrelated to class #135

Closed vbopardi closed 2 years ago

vbopardi commented 2 years ago

I am working on a project in AWS Sagemaker where I am using a machine learning package called snorkel.

I am having trouble importing the snorkel package when it is seemingly already installed.

The error from import snorkel is:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-24-406fde6f1c00> in <module>
      7 import os
      8 import pandas as pd
----> 9 import snorkel
     10 #from snorkel.labeling import labeling_function

ModuleNotFoundError: No module named 'snorkel'

In the terminal, my current working directory is /home/sagemaker-user/p-ai-spring-2022/varun. When I run pip list to list all the installed packages I get the following output:

Package                        Version
------------------------------ -------------------
absl-py                        1.0.0
aioboto3                       9.2.0
aiobotocore                    1.3.3
aiohttp                        3.7.4
aioitertools                   0.6.0
argon2-cffi                    20.1.0
asn1crypto                     1.4.0
async-generator                1.10
async-timeout                  3.0.1
asynctest                      0.13.0
attrs                          19.3.0
aws-embedded-metrics           1.0.7
aws-jupyter-proxy              0.1.0
awscli                         1.21.10
backcall                       0.1.0
bleach                         3.1.4
boto3                          1.17.106
botocore                       1.22.10
brotlipy                       0.7.0
cachetools                     4.2.4
certifi                        2020.4.5.1
cffi                           1.14.5
chardet                        3.0.4
charset-normalizer             2.0.9
colorama                       0.4.3
conda                          4.6.14
cryptography                   3.4.6
decorator                      4.4.2
defusedxml                     0.6.0
docutils                       0.15.2
entrypoints                    0.3
gitdb                          4.0.4
GitPython                      3.1.1
google-auth                    1.35.0
google-auth-oauthlib           0.4.6
grpcio                         1.44.0
idna                           2.9
importlib-metadata             4.11.1
ipykernel                      5.2.0
ipython                        7.13.0
ipython-genutils               0.2.0
jedi                           0.16.0
jeepney                        0.6.0
Jinja2                         2.11.3
jmespath                       0.9.5
joblib                         1.1.0
json5                          0.9.4
jsonschema                     3.2.0
jupyter-client                 6.1.5
jupyter-core                   4.6.3
jupyter-server-proxy           1.3.2
jupyter-telemetry              0.1.0
jupyterlab                     1.2.21
jupyterlab-git                 0.11.0
jupyterlab-server              1.0.7
keyring                        22.0.1
Markdown                       3.3.6
MarkupSafe                     1.1.1
mistune                        0.8.4
multidict                      4.7.5
munkres                        1.1.4
nbconvert                      5.6.1
nbdime                         1.1.1
nbformat                       5.0.5
nest-asyncio                   1.5.4
networkx                       2.6.3
notebook                       6.4.1
numpy                          1.19.5
oauthlib                       3.2.0
packaging                      20.9
pandas                         1.3.5
pandocfilters                  1.4.2
parso                          0.6.2
pexpect                        4.8.0
pickleshare                    0.7.5
pip                            21.0.1
pkginfo                        1.7.0
prometheus-client              0.7.1
prompt-toolkit                 3.0.5
protobuf                       3.19.4
ptyprocess                     0.6.0
pyasn1                         0.4.8
pyasn1-modules                 0.2.8
pycosat                        0.6.3
pycparser                      2.20
Pygments                       2.6.1
pyOpenSSL                      21.0.0
pyparsing                      2.4.7
pyrsistent                     0.16.0
PySocks                        1.7.1
python-dateutil                2.8.0
python-json-logger             0.1.11
pytz                           2021.3
PyYAML                         6.0
pyzmq                          19.0.0
readme-renderer                29.0
requests                       2.23.0
requests-oauthlib              1.3.1
requests-toolbelt              0.9.1
rfc3986                        1.4.0
rsa                            4.8
ruamel-yaml-conda              0.15.80
ruamel.yaml                    0.16.10
ruamel.yaml.clib               0.2.0
s3transfer                     0.5.0
sagemaker-jupyter-server-tools 1.0
sagemaker-nb2kg                0.1
sagemaker-sharing              0.1
sagemaker-ui-proxy             3.21.1
scikit-learn                   0.24.2
scipy                          1.7.3
SecretStorage                  3.3.1
Send2Trash                     1.5.0
setuptools                     49.6.0.post20210108
simpervisor                    0.3
six                            1.14.0
smmap                          3.0.2
snorkel                        0.9.8
supervisor                     4.1.0
tensorboard                    2.6.0
tensorboard-data-server        0.6.1
tensorboard-plugin-wit         1.8.1
terminado                      0.8.3
testpath                       0.4.4
threadpoolctl                  3.1.0
torch                          1.10.2
tornado                        6.1
tqdm                           4.57.0
traitlets                      4.3.3
twine                          3.3.0
typing-extensions              4.0.1
urllib3                        1.25.9
wcwidth                        0.1.9
webencodings                   0.5.1
websockets                     9.1
Werkzeug                       2.0.3
wheel                          0.36.2
wrapt                          1.12.1
yarl                           1.4.2
zipp                           3.1.0

From here, it is clear that the snorkel package is installed on the system.

However, when I open a Jupyter notebook, I run

import os
os.chdir('../varun')

where os.getcwd() returns /root/p-ai-spring-2022/varun. (As far as I know, /root is the same as /home/sagemaker-user in AWS, but I could be wrong about that.)

But running !pip list in the Jupyter notebook gives:

/opt/conda/lib/python3.7/site-packages/secretstorage/dhcrypto.py:16: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
  from cryptography.utils import int_from_bytes
/opt/conda/lib/python3.7/site-packages/secretstorage/util.py:25: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
  from cryptography.utils import int_from_bytes
Package                              Version
------------------------------------ -----------------
aiobotocore                          2.0.1
aiohttp                              3.8.1
aioitertools                         0.8.0
aiosignal                            1.2.0
alabaster                            0.7.12
anaconda-client                      1.7.2
anaconda-project                     0.8.3
argh                                 0.26.2
argon2-cffi                          21.3.0
argon2-cffi-bindings                 21.2.0
asn1crypto                           1.3.0
astroid                              2.9.0
astropy                              4.0
async-timeout                        4.0.1
asynctest                            0.13.0
atomicwrites                         1.3.0
attrs                                19.3.0
autopep8                             1.4.4
autovizwidget                        0.19.1
awscli                               1.22.23
Babel                                2.9.1
backcall                             0.1.0
backports.shutil-get-terminal-size   1.0.0
beautifulsoup4                       4.8.2
bitarray                             1.2.1
bkcharts                             0.2
bleach                               4.1.0
bokeh                                1.4.0
boto                                 2.49.0
boto3                                1.20.23
botocore                             1.22.8
Bottleneck                           1.3.2
brotlipy                             0.7.0
cached-property                      1.5.2
certifi                              2021.10.8
cffi                                 1.14.6
chardet                              3.0.4
charset-normalizer                   2.0.4
Click                                7.0
cloudpickle                          2.0.0
clyent                               1.2.2
colorama                             0.4.3
conda                                4.11.0
conda-package-handling               1.7.3
contextlib2                          0.6.0.post1
cryptography                         36.0.0
cycler                               0.10.0
Cython                               0.29.15
cytoolz                              0.10.1
dask                                 2021.12.0
decorator                            4.4.1
defusedxml                           0.6.0
diff-match-patch                     20181111
dill                                 0.3.4
distributed                          2021.12.0
distro                               1.6.0
docker                               5.0.0
docker-compose                       1.29.2
dockerpty                            0.4.1
docopt                               0.6.2
docutils                             0.15.2
dparse                               0.5.1
entrypoints                          0.3
et-xmlfile                           1.0.1
fastcache                            1.1.0
filelock                             3.0.12
flake8                               3.7.9
Flask                                1.1.1
frozenlist                           1.2.0
fsspec                               2021.11.1
future                               0.18.2
gevent                               1.4.0
glob2                                0.7
gmpy2                                2.0.8
google-pasta                         0.2.0
greenlet                             0.4.15
h5py                                 2.10.0
hdijupyterutils                      0.19.1
HeapDict                             1.0.1
html5lib                             1.0.1
hypothesis                           5.5.4
idna                                 2.8
imageio                              2.6.1
imagesize                            1.2.0
importlib-metadata                   1.5.0
imutils                              0.5.4
intervaltree                         3.0.2
ipykernel                            5.1.4
ipython                              7.12.0
ipython_genutils                     0.2.0
ipywidgets                           7.5.1
isort                                4.3.21
itsdangerous                         1.1.0
jdcal                                1.4.1
jedi                                 0.14.1
jeepney                              0.4.2
Jinja2                               3.0.3
jmespath                             0.10.0
joblib                               0.14.1
json5                                0.9.1
jsonschema                           3.2.0
jupyter                              1.0.0
jupyter-client                       5.3.4
jupyter-console                      6.1.0
jupyter-core                         4.6.1
jupyterlab                           1.2.21
jupyterlab-server                    1.0.6
keyring                              21.1.0
kiwisolver                           1.1.0
lazy-object-proxy                    1.4.3
libarchive-c                         2.8
lief                                 0.9.0
llvmlite                             0.37.0
locket                               0.2.0
lxml                                 4.6.4
MarkupSafe                           2.0.1
matplotlib                           3.1.3
mccabe                               0.6.1
mistune                              0.8.4
mkl-fft                              1.0.15
mkl-random                           1.1.0
mkl-service                          2.3.0
mock                                 4.0.1
more-itertools                       8.2.0
mpmath                               1.1.0
msgpack                              0.6.1
multidict                            5.2.0
multipledispatch                     0.6.0
multiprocess                         0.70.12.2
nbconvert                            5.6.1
nbformat                             5.0.4
nest-asyncio                         1.5.4
networkx                             2.4
nltk                                 3.4.5
nose                                 1.3.7
notebook                             6.4.6
numba                                0.54.1
numexpr                              2.7.1
numpy                                1.20.3
numpydoc                             0.9.2
olefile                              0.46
opencv-python-headless               4.5.5.62
openpyxl                             3.0.3
packaging                            20.1
pandas                               1.0.1
pandocfilters                        1.4.2
parso                                0.5.2
partd                                1.1.0
path                                 13.1.0
pathlib2                             2.3.5
pathos                               0.2.8
pathtools                            0.1.2
patsy                                0.5.1
pep8                                 1.7.1
pexpect                              4.8.0
pickleshare                          0.7.5
Pillow                               8.4.0
pip                                  21.3.1
pkginfo                              1.5.0.1
platformdirs                         2.4.0
plotly                               5.4.0
pluggy                               0.13.1
ply                                  3.11
pox                                  0.3.0
ppft                                 1.6.6.4
prometheus-client                    0.7.1
prompt-toolkit                       3.0.3
protobuf                             3.19.1
protobuf3-to-dict                    0.1.5
psutil                               5.6.7
ptyprocess                           0.6.0
pure-sasl                            0.6.2
py                                   1.11.0
pyarrow                              6.0.1
pyasn1                               0.4.8
pycodestyle                          2.5.0
pycosat                              0.6.3
pycparser                            2.19
pycrypto                             2.6.1
pycurl                               7.43.0.5
pydocstyle                           4.0.1
pyflakes                             2.1.1
pyfunctional                         1.4.3
Pygments                             2.5.2
PyHive                               0.6.4
pykerberos                           1.2.1
pylint                               2.12.2
pyodbc                               4.0.0-unsupported
pyOpenSSL                            19.1.0
pyparsing                            2.4.6
pyrsistent                           0.15.7
PySocks                              1.7.1
pytest                               5.3.5
pytest-arraydiff                     0.3
pytest-astropy                       0.8.0
pytest-astropy-header                0.1.2
pytest-doctestplus                   0.5.0
pytest-openfiles                     0.4.0
pytest-remotedata                    0.3.2
python-dateutil                      2.8.1
python-dotenv                        0.19.2
python-jsonrpc-server                0.3.4
python-language-server               0.31.7
pytz                                 2019.3
PyWavelets                           1.1.1
pyxdg                                0.26
PyYAML                               6.0
pyzmq                                18.1.1
QDarkStyle                           2.8
QtAwesome                            0.6.1
qtconsole                            4.6.0
QtPy                                 1.9.0
requests                             2.26.0
requests-kerberos                    0.12.0
rope                                 0.16.0
rsa                                  4.8
Rtree                                0.9.3
ruamel_yaml                          0.15.87
s3fs                                 2021.11.1
s3transfer                           0.5.0
sagemaker                            2.70.0
sagemaker-studio-analytics-extension 0.0.4
sagemaker-studio-sparkmagic-lib      0.1.3
sasl                                 0.2.1
scikit-image                         0.16.2
scikit-learn                         0.22.1
scipy                                1.4.1
seaborn                              0.10.0
SecretStorage                        3.1.2
Send2Trash                           1.8.0
setuptools                           59.5.0
simplegeneric                        0.8.1
singledispatch                       3.4.0.3
six                                  1.14.0
sklearn                              0.0
smclarify                            0.2
smdebug-rulesconfig                  1.0.1
snowballstemmer                      2.0.0
sortedcollections                    1.1.2
sortedcontainers                     2.1.0
soupsieve                            1.9.5
sparkmagic                           0.19.1
Sphinx                               2.4.0
sphinxcontrib-applehelp              1.0.1
sphinxcontrib-devhelp                1.0.1
sphinxcontrib-htmlhelp               1.0.2
sphinxcontrib-jsmath                 1.0.1
sphinxcontrib-qthelp                 1.0.2
sphinxcontrib-serializinghtml        1.1.3
sphinxcontrib-websupport             1.2.0
spyder                               4.0.1
spyder-kernels                       1.8.1
SQLAlchemy                           1.3.13
statsmodels                          0.11.0
sympy                                1.5.1
tables                               3.6.1
tabulate                             0.8.9
tblib                                1.6.0
tenacity                             8.0.1
terminado                            0.8.3
testpath                             0.4.4
texttable                            1.6.4
thrift                               0.13.0
thrift-sasl                          0.4.3
toml                                 0.10.2
toolz                                0.10.0
tornado                              6.1
tqdm                                 4.42.1
traitlets                            4.3.3
typed-ast                            1.5.1
typing_extensions                    4.0.1
ujson                                1.35
unicodecsv                           0.14.1
urllib3                              1.26.7
watchdog                             0.10.2
wcwidth                              0.1.8
webencodings                         0.5.1
websocket-client                     0.59.0
Werkzeug                             1.0.0
wheel                                0.34.2
widgetsnbextension                   3.5.1
wrapt                                1.11.2
wurlitzer                            2.0.0
xlrd                                 1.2.0
XlsxWriter                           1.2.7
xlwt                                 1.3.0
yapf                                 0.28.0
yarl                                 1.7.2
zict                                 1.0.0
zipp                                 2.2.0

Here, snorkel is not in the list, so it is not installed in the environment the notebook uses.

Also, I noticed that running sys.executable gives:

'/opt/conda/bin/python'

Whereas running sys.path gives:

['/root/p-ai-spring-2022/serena',
 '/opt/conda/lib/python37.zip',
 '/opt/conda/lib/python3.7',
 '/opt/conda/lib/python3.7/lib-dynload',
 '',
 '/opt/conda/lib/python3.7/site-packages',
 '/opt/conda/lib/python3.7/site-packages/IPython/extensions',
 '/root/.ipython']

Maybe this has something to do with it but I'm unsure. What can I do to troubleshoot?
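
One way to narrow this down is to ask the notebook kernel itself which interpreter it is running and whether that interpreter can locate the package. The snippet below is a minimal sketch using only the standard library; the helper name describe_environment is made up for illustration and is not part of snorkel or SageMaker. Running the same cell in the notebook and the same code in the terminal's python should show whether the two are different environments.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report which interpreter is running and whether it can find module_name.

    describe_environment is a hypothetical helper name used for this sketch.
    """
    # find_spec returns None when a top-level module cannot be found on sys.path.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,          # root directory of this environment
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


# Print the report for the package from this thread.
for key, value in describe_environment("snorkel").items():
    print(f"{key}: {value}")
```

If the executable or prefix printed in the notebook differs from what the terminal prints, the pip in the terminal is installing into a different environment than the one the kernel imports from.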

mikeizbicki commented 2 years ago

It looks like you have multiple Python environments installed, some of them with access to the snorkel library and some without. Inside the notebook, is there a reason you can't run !pip install snorkel to install it into whichever Python environment you're using there?
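
As a hedged sketch of the suggestion above: a shell command such as !pip may resolve to a pip that belongs to a different interpreter than the kernel, so one common way to be sure is to invoke pip through sys.executable. The function names below are made up for illustration, and the example only builds and prints the command unless run=True is passed.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter that is running this code."""
    # "python -m pip" runs the pip module of that exact interpreter.
    return [sys.executable, "-m", "pip", "install", package]


def install_into_kernel(package, run=False):
    """Return the install command; execute it only when run=True.

    install_into_kernel is a hypothetical helper name used for this sketch.
    """
    command = pip_install_command(package)
    if run:
        # check_call raises CalledProcessError if pip exits with an error.
        subprocess.check_call(command)
    return command


# Show the command that would be run for the package from this thread.
print(" ".join(install_into_kernel("snorkel")))
```

Calling install_into_kernel("snorkel", run=True) in a notebook cell would perform the install. Recent IPython versions also provide a %pip magic (%pip install snorkel) that is meant to target the running kernel's environment. After installing, the kernel may need a restart before import snorkel succeeds.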

mikeizbicki commented 2 years ago

PS. I'm glad to have you ask non-class-related questions here! But since this isn't class-related, I'm closing the issue so that it's not mixed in with assignment questions and potentially causing confusion. Feel free to continue the discussion in the closed issue, though.