Closed sidyakinian closed 2 years ago
Your environment is not set up properly. Follow the instructions here: https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-intro/pytorch-setup/pytorch-quickstart.html#pytorch-quickstart
Hi sidyakinian,
I was able to allocate an `ml.t3.medium` and open JupyterLab 1. I then used JupyterLab to open a terminal and ran:
source activate python3
pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com/
pip install torch-neuron==1.9.1.* neuron-cc[tensorflow]
pip install sagemaker>=2.79.0 transformers==4.12.3 --upgrade
This worked without the issue you encountered. When I tried to run the commands directly inside the notebook, as in the tutorial, the process was `Killed`, possibly because the instance size is too small.
I was not able to use the `pytorch_p38` environment, which I believe you may have activated by accident based on this line in your post:
~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch_neuron/convert.py
Please note that this environment is not the one mentioned in the tutorial you linked. The tutorial uses the `python3` conda env. When I try to run the commands using the `pytorch_p38` env, the process is `Killed` for me whether I run it inside a notebook or a terminal.
My suggestion is to retry using the steps I mentioned above, and if you encounter memory issues please try a larger instance size.
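A minimal sketch of how to confirm which conda environment a kernel or a traceback belongs to. It pulls the environment name out of a path such as the `torch_neuron/convert.py` line quoted above. `conda_env_from_path` is a hypothetical helper name, and the sketch assumes the conventional `.../envs/<name>/...` directory layout.

```python
import sys
from pathlib import PurePosixPath
from typing import Optional


def conda_env_from_path(path: str) -> Optional[str]:
    """Return the conda env name embedded in a path, or None if there is none.

    Assumption: the path follows the usual .../envs/<env-name>/... layout.
    """
    parts = PurePosixPath(path).parts
    # Skip the last component so that parts[i + 1] always exists.
    for i, part in enumerate(parts[:-1]):
        if part == "envs":
            return parts[i + 1]
    return None


traceback_path = (
    "~/anaconda3/envs/pytorch_p38/lib/python3.8/site-packages/torch_neuron/convert.py"
)
print(conda_env_from_path(traceback_path))

# The interpreter behind the current kernel can be inspected the same way.
print(sys.executable, conda_env_from_path(sys.executable))
```

Running the last line in a notebook cell shows whether the kernel is backed by the `python3` env or by a different one.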
@aws-stdun Thank you, your suggestion works! Seems like a different conda env was the problem. Getting a TF import error now, but that's a different issue.
@aws-stdun Sorry to say the issue persists. I've done the following:
- `ml.m4.2xlarge`.
- `conda_python3` kernel.
- `1.23.1` and removing duplicate versions of numpy.

After all this, the same error appears.
It looks like the `aten::embedding` operation isn't supported. `import torch.neuron; print(*torch.neuron.get_supported_operations(), sep='\n')` indeed doesn't list `aten::embedding`, though it lists `aten::embedding_renorm_`. I've checked the release notes for `torch.neuron` supported operations, and it seems `aten::embedding` was added in `v1.0.763.0` but then promptly removed in `v1.0.1001.0` because it didn't meet performance criteria.
Also, could this be related to #410?
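The check described above can be sketched as a small membership test. `missing_operations` is a hypothetical helper, and the fallback list is an illustrative stand-in that only lets the sketch run without `torch-neuron` installed. It is not the real supported-operator list.

```python
from typing import Iterable, List


def missing_operations(required: Iterable[str], supported: Iterable[str]) -> List[str]:
    """Return the required operator names that are absent from the supported set."""
    supported_set = set(supported)
    return [op for op in required if op not in supported_set]


try:
    import torch.neuron  # provided by the torch-neuron package

    supported_ops = list(torch.neuron.get_supported_operations())
except Exception:
    # Illustrative stand-in only; used when torch-neuron is not importable.
    supported_ops = ["aten::embedding_renorm_", "aten::linear"]

print(missing_operations(["aten::embedding"], supported_ops))
```

An operator reported as missing here is not necessarily fatal, since it only tells you that the operator will not be compiled for Neuron.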
I'm getting the same issue with the Neuron SDK PyTorch tutorial on the `conda_python3` kernel and an `ml.c5.4xlarge` SageMaker notebook instance.
I've tried reinstalling `neuron-cc` as advised by "Please check that neuron-cc is installed and working properly".
Downgrading to Python 3.7.13 and compiling on Google Colab doesn't work either.
Hi @sidyakinian,
Based on your error messages above, it doesn't look like you've installed the compiler (`neuron-cc`).
The environment setup instructions from the tutorial you shared have this line, which is required:
!pip install torch-neuron==1.9.1.* neuron-cc[tensorflow] sagemaker>=2.79.0 transformers==4.12.3 --upgrade
If you believe your env is set up correctly and you are still receiving this error, can you share the output of:
!pip list | grep neuron
With respect to your earlier comment about `aten::embedding` not being supported:
Unsupported operators will be placed on CPU. In this model, `torch-neuron` will partition the embedding onto CPU, which is expected here and will not significantly impact the performance of Inferentia on this model.
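One caveat about the install line quoted above: an unquoted `>=` is parsed by the shell as an output redirection, so the `sagemaker>=2.79.0` constraint does not reach pip as written. A minimal sketch that sidesteps this by building the command as an argument list, so no shell is involved. `build_pip_install_command` is a hypothetical helper, and combining the packages with `--extra-index-url` is an assumption based on the commands elsewhere in this thread.

```python
import sys
from typing import List

# Neuron package index, as given elsewhere in this thread.
NEURON_INDEX_URL = "https://pip.repos.neuron.amazonaws.com"


def build_pip_install_command() -> List[str]:
    """Build the tutorial's install command as an argument list (no shell parsing)."""
    return [
        sys.executable, "-m", "pip", "install", "--upgrade",
        "--extra-index-url", NEURON_INDEX_URL,
        "torch-neuron==1.9.1.*",
        "neuron-cc[tensorflow]",
        "sagemaker>=2.79.0",  # safe here: each list item is passed to pip verbatim
        "transformers==4.12.3",
    ]


command = build_pip_install_command()
print(" ".join(command))
# To actually install, pass the list to subprocess.run(command, check=True).
```

Using `sys.executable -m pip` also ensures the packages land in the environment that backs the running kernel.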
@aws-stdun Output of `!pip list | grep neuron`:
neuron-cc 1.0
torch-neuron 1.9.1.2.3.0.0
Hello @sidyakinian,
It appears that you may have accidentally downloaded `neuron-cc` from PyPI (https://pypi.org/project/neuron-cc/) and not from our repository (https://pip.repos.neuron.amazonaws.com). Can you try re-installing with `python -m pip install --extra-index-url https://pip.repos.neuron.amazonaws.com --force-reinstall neuron-cc` and try again?
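A minimal sketch of how this situation could be detected from inside the notebook. The only signal it relies on is the `neuron-cc 1.0` entry in the `pip list` output above, so `looks_like_pypi_placeholder` is a hypothetical heuristic based on this thread, not an official check.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def looks_like_pypi_placeholder(version: Optional[str]) -> bool:
    """Heuristic from this thread: the neuron-cc pulled from PyPI reported version 1.0."""
    return version == "1.0"


neuron_cc_version = installed_version("neuron-cc")
if neuron_cc_version is None:
    print("neuron-cc is not installed in this environment")
elif looks_like_pypi_placeholder(neuron_cc_version):
    print("neuron-cc 1.0 found; it was probably not installed from the Neuron repository")
else:
    print(f"neuron-cc {neuron_cc_version} found")
```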
@aws-taylor Thank you, I do seem to have had the wrong version! New versions of `neuron-cc` and `torch-neuron`:
neuron-cc 1.11.7.0+aec18907e
torch-neuron 1.9.1.2.3.0.0
Unfortunately, this still doesn't solve the issue.
I've also tried updating to `torch-neuron 1.11.0.2.3.0.0` but still get the same error and stack trace.
Hi @sidyakinian,
It looks like you may have installed TensorFlow 2 into your environment, which is not compatible with `torch-neuron`.
Can you try `pip install tensorflow==1.15` and rerun compilation?
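A minimal sketch of a pre-flight check for the conflict described above. `tensorflow_conflicts` is a hypothetical helper, and treating any major version of 2 or higher as a conflict is an assumption drawn from this comment.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a dotted version string."""
    return int(version.split(".", 1)[0])


def tensorflow_conflicts(version: Optional[str]) -> bool:
    """True when an installed TensorFlow is 2.x or newer (assumed conflict)."""
    return version is not None and major_version(version) >= 2


try:
    tf_version = metadata.version("tensorflow")
except metadata.PackageNotFoundError:
    tf_version = None  # TensorFlow is not installed at all

print(tf_version, tensorflow_conflicts(tf_version))
```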
Hi @sidyakinian, please reopen this ticket if you need further assistance. Thanks!
I'm following this Hugging Face tutorial exactly (also posted here) and running into a runtime error:
No operations were successfully partitioned and compiled to neuron for this model - aborting trace!
Notebook code

Packages

I'm using an Amazon SageMaker `ml.t3.medium` notebook instance.

Output of `pip list`
```
Package Version
---------------------------------- --------------------------
absl-py 1.2.0
aiobotocore 2.0.1
aiohttp 3.8.1
aioitertools 0.8.0
aiosignal 1.2.0
alabaster 0.7.12
anaconda-client 1.8.0
anaconda-project 0.10.2
anyio 3.4.0
appdirs 1.4.4
argh 0.26.2
argon2-cffi 21.1.0
arrow 1.2.1
asn1crypto 1.4.0
astroid 2.8.6
astropy 5.0
astunparse 1.6.3
async-generator 1.10
async-timeout 4.0.1
atomicwrites 1.4.0
attrs 21.2.0
autopep8 1.5.6
autovizwidget 0.19.1
awscli 1.25.38
Babel 2.9.1
backcall 0.2.0
backports.functools-lru-cache 1.6.4
backports.shutil-get-terminal-size 1.0.0
bcrypt 3.2.2
beautifulsoup4 4.10.0
binaryornot 0.4.4
bitarray 2.3.4
bkcharts 0.2
black 21.11b0
bleach 4.1.0
blis 0.7.6
bokeh 2.4.2
boto 2.49.0
boto3 1.24.38
botocore 1.27.44
Bottleneck 1.3.2
brotlipy 0.7.0
cached-property 1.5.2
cachetools 5.2.0
captum 0.4.1
catalogue 2.0.6
certifi 2021.10.8
cffi 1.15.0
chardet 4.0.0
charset-normalizer 2.0.7
click 8.0.3
cloudpickle 2.0.0
clyent 1.2.2
colorama 0.4.3
conda-pack 0.6.0
contextlib2 21.6.0
cookiecutter 1.7.0
coverage 6.3.2
cryptography 36.0.0
cycler 0.11.0
cymem 2.0.6
Cython 0.29.24
cytoolz 0.11.2
dask 2021.11.2
debugpy 1.5.1
decorator 5.1.0
defusedxml 0.7.1
diff-match-patch 20200713
dill 0.3.4
distributed 2021.11.2
distro 1.7.0
dmlc-nnvm 1.11.0.0+0
dmlc-topi 1.11.0.0+0
dmlc-tvm 1.11.0.0+0
docker 5.0.3
docker-compose 1.29.2
dockerpty 0.4.1
docopt 0.6.2
docutils 0.15.2
dparse 0.5.1
entrypoints 0.3
environment-kernels 1.1.1
et-xmlfile 1.0.1
fastai 2.1.10
fastcache 1.1.0
fastcore 1.3.29
fastprogress 1.0.2
filelock 3.4.0
flake8 3.8.4
Flask 2.0.2
Flask-Cors 3.0.10
flatbuffers 1.12
fonttools 4.28.2
frozenlist 1.2.0
fsspec 2021.11.1
future 0.18.2
gast 0.4.0
gevent 21.8.0
glob2 0.7
gmpy2 2.1.0rc1
google-auth 2.9.1
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
greenlet 1.1.2
grpcio 1.47.0
gssapi 1.7.3
h5py 3.4.0
hdijupyterutils 0.19.1
HeapDict 1.0.1
horovod 0.23.0
html5lib 1.1
huggingface-hub 0.8.1
idna 3.1
imagecodecs 2021.11.20
imageio 2.9.0
imagesize 1.3.0
importlib-metadata 4.8.2
importlib-resources 5.4.0
inferentia-hwm 1.11.0.0+0
inflection 0.5.1
iniconfig 1.1.1
intervaltree 3.0.2
ipykernel 6.5.0
ipyparallel 8.0.0
ipython 7.32.0
ipython-genutils 0.2.0
ipywidgets 7.6.5
islpy 2021.1+aws2021.x.16.0.bld0
isort 5.10.1
itsdangerous 2.0.1
jdcal 1.4.1
jedi 0.17.2
jeepney 0.7.1
Jinja2 3.0.3
jinja2-time 0.2.0
jmespath 0.10.0
joblib 1.1.0
json5 0.9.5
jsonschema 3.2.0
jupyter 1.0.0
jupyter-client 7.1.0
jupyter-console 6.4.0
jupyter-core 4.9.1
jupyter-server 1.12.0
jupyterlab 3.2.4
jupyterlab-pygments 0.1.2
jupyterlab-server 2.8.2
jupyterlab-widgets 1.0.2
keras 2.9.0
Keras-Preprocessing 1.1.2
keyring 23.2.1
kiwisolver 1.3.2
krb5 0.3.0
langcodes 3.3.0
lazy-object-proxy 1.6.0
libarchive-c 3.1
libclang 14.0.6
llvmlite 0.36.0
locket 0.2.0
lxml 4.8.0
Markdown 3.4.1
MarkupSafe 2.0.1
matplotlib 3.5.0
matplotlib-inline 0.1.3
mccabe 0.6.1
mistune 0.8.4
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
mock 4.0.3
more-itertools 8.12.0
mpi4py 3.0.3
mpmath 1.2.1
msgpack 1.0.3
multidict 5.2.0
multipledispatch 0.6.0
multiprocess 0.70.12.2
munkres 1.1.4
murmurhash 1.0.6
mypy-extensions 0.4.3
nb-conda 2.2.1
nb-conda-kernels 2.3.1
nbclassic 0.3.4
nbclient 0.5.9
nbconvert 6.3.0
nbformat 5.1.3
nest-asyncio 1.5.1
networkx 2.4
neuron-cc 1.0
nltk 3.6.5
nose 1.3.7
notebook 6.4.6
numba 0.53.1
numexpr 2.7.3
numpy 1.20.0
numpydoc 1.1.0
oauthlib 3.2.0
olefile 0.46
onnx 1.10.2
opencv-python 4.5.1.48
openpyxl 3.0.9
opt-einsum 3.3.0
packaging 21.3
pandas 1.3.4
pandocfilters 1.5.0
paramiko 2.11.0
parso 0.7.0
partd 1.2.0
path 16.2.0
pathlib2 2.3.6
pathos 0.2.8
pathspec 0.9.0
pathtools 0.1.2
pathy 0.6.1
patsy 0.5.2
pep8 1.7.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 9.0.1
pip 22.0.4
pkginfo 1.8.1
platformdirs 2.3.0
plotly 5.6.0
pluggy 1.0.0
ply 3.11
pooch 1.5.2
pox 0.3.0
poyo 0.5.0
ppft 1.6.6.4
preshed 3.0.6
prometheus-client 0.12.0
prompt-toolkit 3.0.22
protobuf 3.19.4
protobuf3-to-dict 0.1.5
psutil 5.8.0
psycopg2 2.9.2
ptyprocess 0.7.0
py 1.11.0
py4j 0.10.9
pyarrow 7.0.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycodestyle 2.6.0
pycosat 0.6.3
pycparser 2.21
pycurl 7.44.1
pydantic 1.8.2
pydocstyle 6.1.1
pyerfa 2.0.0.1
pyflakes 2.2.0
pyfunctional 1.4.3
pygal 2.4.0
Pygments 2.10.0
pyinstrument 3.4.2
pyinstrument-cext 0.2.4
pykerberos 1.2.1
pylint 2.11.1
pyls-black 0.4.6
pyls-spyder 0.3.2
PyNaCl 1.5.0
pynvml 8.0.4
pyodbc 4.0.32
pyOpenSSL 21.0.0
pyparsing 3.0.6
PyQt5 5.12.3
PyQt5_sip 4.19.18
PyQtChart 5.12
PyQtWebEngine 5.12.1
pyrsistent 0.18.0
PySocks 1.7.1
pyspark 3.0.0
pyspnego 0.5.0
pytest 6.2.5
python-dateutil 2.8.2
python-dotenv 0.20.0
python-jsonrpc-server 0.4.0
python-language-server 0.36.2
pytz 2021.3
PyWavelets 1.2.0
pyxdg 0.27
PyYAML 5.4.1
pyzmq 22.3.0
QDarkStyle 3.0.2
qstylizer 0.2.1
QtAwesome 1.1.0
qtconsole 5.2.1
QtPy 1.11.2
regex 2021.11.10
requests 2.26.0
requests-kerberos 0.14.0
requests-oauthlib 1.3.1
rope 0.22.0
rsa 4.7.2
Rtree 0.9.7
ruamel-yaml-conda 0.15.80
s3fs 0.4.0
s3transfer 0.6.0
sacremoses 0.0.53
safety 1.10.3
sagemaker 2.101.1
sagemaker-pyspark 1.4.2
scikit-image 0.18.3
scikit-learn 1.0.1
scipy 1.4.1
seaborn 0.11.2
SecretStorage 3.3.1
Send2Trash 1.8.0
setuptools 59.2.0
shap 0.40.0
simplegeneric 0.8.1
singledispatch 0.0.0
sip 4.19.25
six 1.16.0
sklearn 0.0
slicer 0.0.7
smart-open 5.2.1
smclarify 0.2
smdebug 1.0.12
smdebug-rulesconfig 1.0.1
sniffio 1.2.0
snowballstemmer 2.2.0
sortedcollections 2.1.0
sortedcontainers 2.4.0
soupsieve 2.3
spacy 3.2.3
spacy-legacy 3.0.9
spacy-loggers 1.0.1
sparkmagic 0.15.0
Sphinx 4.3.0
sphinxcontrib-applehelp 1.0.2
sphinxcontrib-devhelp 1.0.2
sphinxcontrib-htmlhelp 2.0.0
sphinxcontrib-jsmath 1.0.1
sphinxcontrib-qthelp 1.0.3
sphinxcontrib-serializinghtml 1.1.5
sphinxcontrib-websupport 1.2.4
spyder 5.0.5
spyder-kernels 2.0.5
SQLAlchemy 1.4.27
srsly 2.4.2
statsmodels 0.13.1
sympy 1.9
tables 3.6.1
tabulate 0.8.9
tblib 1.7.0
tenacity 8.0.1
tensorboard 2.9.1
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.9.1
tensorflow-estimator 2.9.0
tensorflow-io-gcs-filesystem 0.26.0
termcolor 1.1.0
terminado 0.12.1
testpath 0.5.0
textdistance 4.2.2
texttable 1.6.4
thinc 8.0.15
threadpoolctl 3.0.0
three-merge 0.1.1
tifffile 2021.11.2
tinycss2 1.1.1
tokenizers 0.10.3
toml 0.10.2
tomli 1.2.2
toolz 0.11.2
torch 1.9.1
torch-model-archiver 0.5.0b20211117
torch-neuron 1.9.1.2.3.0.0
torch-workflow-archiver 0.2.0b20211118
torchaudio 0.10.0
torchserve 0.5.0b20211117
torchtext 0.11.0
torchvision 0.11.1
tornado 6.1
tqdm 4.62.3
traitlets 5.1.1
transformers 4.12.3
typed-ast 1.5.0
typer 0.4.0
typing_extensions 4.0.0
ujson 4.2.0
unicodecsv 0.14.1
unicodedata2 13.0.0.post2
urllib3 1.26.8
wasabi 0.9.0
watchdog 2.1.6
wcwidth 0.2.5
webencodings 0.5.1
websocket-client 0.59.0
Werkzeug 2.0.3
wheel 0.37.0
whichcraft 0.6.1
widgetsnbextension 3.5.2
wrapt 1.13.3
wurlitzer 3.0.2
xlrd 2.0.1
XlsxWriter 3.0.2
xlwt 1.3.0
yapf 0.31.0
yarl 1.7.2
zict 2.0.0
zipp 3.6.0
zope.event 4.5.0
zope.interface 5.4.0
```