adekunleoajayi opened this issue 11 months ago
@adekunleoajayi can you please share your installed version and environment information?
@philschmid, here is the environment and version information:
OS: Linux
Python: 3.8
optimum-neuron:
1. pip install "git+https://github.com/huggingface/optimum-neuron.git@b94d534cc0160f1e199fae6ae3a1c7b804b49e30" --upgrade (version: 0.0.7.dev0)
2. pip install git+https://github.com/huggingface/optimum-neuron.git (version: 0.0.8.dev0)
3. pip install optimum-neuron (version: 0.0.7)
4. pip install optimum[neuron] (version: 0.0.1)
The first three gave the same error:
usage: optimum-cli export [-h] {onnx,tflite} ...
optimum-cli export: error: invalid choice: 'neuron' (choose from 'onnx', 'tflite')
while the last one printed WARNING: optimum-neuron 0.0.1 does not provide the extra 'neuron' during installation, and does not even ship the optimum-cli entry point.
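A quick way to confirm which versions pip actually resolved in the active environment is a stdlib-only probe (the package list below is just the set relevant to this issue, not exhaustive):

```python
from importlib.metadata import version, PackageNotFoundError

# Print the installed version of each package relevant to this issue;
# a missing or stale optimum-neuron is what makes optimum-cli reject
# the "neuron" subcommand.
for pkg in ("optimum", "optimum-neuron", "torch-neuronx", "neuronx-cc"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```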
Hi @adekunleoajayi, I was running into the same issue. I think it is related to the development version of Optimum Neuron. When I use the HF AMI as described here https://www.philschmid.de/setup-aws-trainium, optimum-cli works as expected; but as soon as I install the development version inside an instance created from that AMI, optimum-cli stops working.
Here is a dump of pip freeze on that AMI (hope it helps):
absl-py==1.4.0
accelerate==0.20.3
aiohttp==3.8.4
aiosignal==1.3.1
anyio==3.7.0
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
asttokens==2.2.1
async-timeout==4.0.2
attrs==21.2.0
Automat==20.2.0
aws-neuronx-runtime-discovery==2.9
Babel==2.8.0
backcall==0.2.0
bcrypt==3.2.0
beautifulsoup4==4.12.2
bleach==6.0.0
blinker==1.4
boto3==1.27.0
botocore==1.30.0
cachetools==5.3.1
certifi==2020.6.20
cffi==1.15.1
chardet==4.0.0
charset-normalizer==3.1.0
click==8.0.3
cloud-init==23.1.2
cloud-tpu-client==0.10
colorama==0.4.4
coloredlogs==15.0.1
comm==0.1.3
command-not-found==0.3
configobj==5.0.6
constantly==15.1.0
cryptography==3.4.8
datasets==2.13.0
dbus-python==1.2.18
debugpy==1.6.7
decorator==5.1.1
defusedxml==0.7.1
dill==0.3.6
distlib==0.3.4
distro==1.7.0
distro-info===1.1build1
docutils==0.20.1
ec2-hibinit-agent==1.0.0
ec2-metadata==2.10.0
evaluate==0.4.0
exceptiongroup==1.1.2
executing==1.2.0
fastjsonschema==2.17.1
filelock==3.6.0
frozenlist==1.3.3
fsspec==2023.6.0
google-api-core==1.34.0
google-api-python-client==1.8.0
google-auth==2.21.0
google-auth-httplib2==0.1.0
google-auth-oauthlib==1.0.0
googleapis-common-protos==1.59.1
grpcio==1.56.0
hibagent==1.0.1
httplib2==0.20.2
huggingface-hub==0.16.2
humanfriendly==10.0
hyperlink==21.0.0
idna==3.3
importlib-metadata==4.6.4
incremental==21.3.0
ipykernel==6.24.0
ipython==8.14.0
ipython-genutils==0.2.0
islpy==2023.1
jedi==0.18.2
jeepney==0.7.1
Jinja2==3.0.3
jmespath==1.0.1
joblib==1.3.1
jsonpatch==1.32
jsonpointer==2.0
jsonschema==3.2.0
jupyter-events==0.6.3
jupyter_client==8.3.0
jupyter_core==5.3.1
jupyter_server==2.7.0
jupyter_server_terminals==0.4.4
jupyterlab-pygments==0.2.2
keyring==23.5.0
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
libneuronxla==0.5.326
lockfile==0.12.2
Markdown==3.4.3
MarkupSafe==2.1.1
matplotlib-inline==0.1.6
mistune==3.0.1
more-itertools==8.10.0
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.14
nbclassic==1.0.0
nbclient==0.8.0
nbconvert==7.6.0
nbformat==5.9.0
nest-asyncio==1.5.6
netifaces==0.11.0
networkx==2.6.3
neuronx-cc==2.7.0.40+f7c6cf2a3
neuronx-hwm==2.7.0.3+0092b9d34
notebook==6.5.4
notebook_shim==0.2.3
numpy==1.21.6
nvidia-cublas-cu11==11.10.3.66
nvidia-cuda-nvrtc-cu11==11.7.99
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
oauth2client==4.1.3
oauthlib==3.2.0
optimum==1.9.0
optimum-neuron==0.0.7
overrides==7.3.1
packaging==23.1
pandas==2.0.3
pandocfilters==1.5.0
parso==0.8.3
pexpect==4.8.0
pgzip==0.3.4
pickleshare==0.7.5
Pillow==10.0.0
platformdirs==2.5.1
prometheus-client==0.17.0
prompt-toolkit==3.0.39
protobuf==3.20.2
psutil==5.9.5
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==12.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.1
pycparser==2.21
Pygments==2.15.1
PyGObject==3.42.1
PyHamcrest==2.0.2
PyJWT==2.3.0
pyOpenSSL==21.0.0
pyparsing==2.4.7
pyrsistent==0.18.1
pyserial==3.5
python-apt==2.4.0+ubuntu1
python-daemon==3.0.1
python-dateutil==2.8.2
python-debian===0.1.43ubuntu1
python-json-logger==2.0.7
python-magic==0.4.24
pytz==2022.1
PyYAML==5.4.1
pyzmq==25.1.0
regex==2023.6.3
requests==2.28.2
requests-oauthlib==1.3.1
requests-unixsocket==0.3.0
responses==0.18.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rsa==4.9
s3transfer==0.6.1
safetensors==0.3.1
scikit-learn==1.3.0
scipy==1.7.3
SecretStorage==3.3.1
Send2Trash==1.8.2
sentencepiece==0.1.99
service-identity==18.1.0
six==1.16.0
sniffio==1.3.0
sos==4.4
soupsieve==2.4.1
ssh-import-id==5.11
stack-data==0.6.2
sympy==1.12
systemd-python==234
tensorboard==2.13.0
tensorboard-data-server==0.7.1
tensorboard-plugin-neuronx==2.5.37.0
terminado==0.17.1
threadpoolctl==3.1.0
tinycss2==1.2.1
tokenizers==0.13.3
torch==1.13.1
torch-neuronx==1.13.1.1.8.0
torch-xla==1.13.1+torchneuron7
torchvision==0.14.1
tornado==6.3.2
tqdm==4.65.0
traitlets==5.9.0
transformers==4.30.2
Twisted==22.1.0
typing_extensions==4.7.1
tzdata==2023.3
ubuntu-advantage-tools==8001
ufw==0.36.1
unattended-upgrades==0.1
uritemplate==3.0.1
urllib3==1.26.5
virtualenv==20.13.0+ds
wadllib==1.3.6
wcwidth==0.2.6
webencodings==0.5.1
websocket-client==1.6.1
Werkzeug==2.3.6
xxhash==3.2.0
yarl==1.9.2
zipp==1.0.0
zope.interface==5.4.0
I'm running into the same error message when I install optimum[neuronx] on a regular Linux machine. It does work when I install it on an Inferentia2 AWS instance, which suggests the hardware is required for the conversion. Is it possible to convert the model without the chip somehow?
Hi @adekunleoajayi, I am not able to reproduce the issue that you met with optimum-neuron==0.0.7:
(aws_neuron_venv_2.8) ubuntu@ip-xxx-xx-xx-xx:~$ optimum-cli export neuron --model yiyanghkust/finbert-tone --sequence_length 128 --batch_size 1 test_issue/
Validating Neuron model...
- Validating Neuron Model output "logits":
-[✓] (1, 3) matches (1, 3)
-[✓] all values close (atol: 0.0001)
The Neuronx export succeeded and the exported model was saved at: .
From the error log, it seems that the neuron backend was not registered. Do you have all the neuronx dependencies installed? Could you send me the output of the following commands so that I can see your setup?
pip3 list | grep -e neuron -e xla -e torch
apt list --installed | grep aws-neuron
Below are the versions that I used to export your checkpoint:
aws-neuronx-runtime-discovery 2.9
libneuronxla 0.5.391
neuronx-cc 2.8.0.25+a3ad0f342
neuronx-distributed 0.2.0
neuronx-hwm 2.8.0.3+2b7c6da39
optimum-neuron 0.0.7
torch 1.13.1
torch-neuronx 1.13.1.1.9.0
torch-xla 1.13.1+torchneuron8
torchvision 0.14.1
transformers-neuronx 0.5.58
Driver:
aws-neuronx-collectives/unknown,now 2.15.13.0-db4e2d9a9 amd64 [installed,upgradable to: 2.15.16.0-db4e2d9a9]
aws-neuronx-dkms/unknown,now 2.11.9.0 amd64 [installed]
aws-neuronx-runtime-lib/unknown,now 2.15.11.0-f168cb23b amd64 [installed,upgradable to: 2.15.14.0-279f319f2]
aws-neuronx-tools/unknown,now 2.12.2.0 amd64 [installed]
Hi @evellasques, the dev branch of optimum-neuron was not stable enough. Last week I improved our inference CI (#168), so it should be better going forward. If you want to test the dev branch for the latest features, the best practice is to check whether the current CIs are green for the features you plan to use.
@vishvananda, sure! You can compile your model on a CPU-only instance and then run it on INF2 / INF1. You just need to ensure that you:
1. Install optimum-neuron[neuronx].
2. Export with --disable-validation, since validation requires Neuron devices. Remove --disable-validation to enable validation if you do have Neuron devices on your instance.
3. Configure compilation args for better latency; I would recommend starting with auto_cast matmul and auto_cast_type bf16.
@adekunleoajayi and @evellasques, I took the yiyanghkust/finbert-tone checkpoint as an example:
optimum-cli export neuron --model yiyanghkust/finbert-tone --disable-validation --sequence_length 128 --batch_size 1 test_issue/
The compiled artifacts can be found here: Jingya/finbert-tone
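For scripting the CPU-only compile step, the CLI invocation can be assembled programmatically; a minimal sketch (the build_export_cmd helper is illustrative, not part of optimum-neuron):

```python
import shlex

# Build the optimum-cli export command for a CPU-only compile.
# --disable-validation is kept because validation needs Neuron devices.
def build_export_cmd(model_id, out_dir, seq_len=128, batch=1):
    return [
        "optimum-cli", "export", "neuron",
        "--model", model_id,
        "--disable-validation",
        "--sequence_length", str(seq_len),
        "--batch_size", str(batch),
        out_dir,
    ]

cmd = build_export_cmd("yiyanghkust/finbert-tone", "test_issue/")
print(shlex.join(cmd))
# optimum-cli export neuron --model yiyanghkust/finbert-tone --disable-validation --sequence_length 128 --batch_size 1 test_issue/
```

The resulting list can be passed directly to subprocess.run on the compile machine.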
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForSequenceClassification

# Load the pre-compiled Neuron model and the original checkpoint's tokenizer
model = NeuronModelForSequenceClassification.from_pretrained("Jingya/finbert-tone")
tokenizer = AutoTokenizer.from_pretrained("yiyanghkust/finbert-tone")

inputs = tokenizer("there is a shortage of capital, and we need extra financing", return_tensors="pt")
logits = model(**inputs).logits
print(model.config.id2label[logits.argmax().item()])
# 'Negative'
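If class probabilities are needed rather than just the top label, the raw logits can be normalized with a numerically stable softmax; a small stdlib-only sketch (the logit values below are made up for illustration, not real model output):

```python
import math

# Numerically stable softmax: subtract the max before exponentiating
# so large logits cannot overflow.
def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 0.5, -1.0])  # illustrative 3-class logits
print([round(p, 3) for p in probs])
# [0.786, 0.175, 0.039]
```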
Hey folks, any other questions on this particular issue? I will close it, as it should be resolved by the recent 0.0.10 release. Feel free to reopen the issue if there are any further questions.
@JingyaHuang Thank you for your explanations!
I tried many optimum-neuron versions, and none of them worked for me. Every time I try to convert the model to AWS Neuron, I get this error:
usage: optimum-cli export [-h] {onnx,tflite} ...
optimum-cli export: error: invalid choice: 'neuron' (choose from 'onnx', 'tflite')
Here are some info about my environment:
Instance: inf2.8xlarge
OS: Linux
Distribution: Ubuntu 20.04
AMI: Deep Learning AMI GPU PyTorch 1.13.1 (Ubuntu 20.04) 20231103
Also, you can check the following screenshots:
Is the HF AMI necessary for model conversion?
Hi @AhmedAl93, the Hugging Face Neuron Deep Learning AMI (Ubuntu 22.04) is the recommended way to use optimum-neuron on EC2 instances, since every dependency is configured and tested; we also recommend using the latest one. I saw that you are using a GPU AMI; could you try the Neuron AMI?
If you have any subscription issues and need to continue with your current instance, I would suggest installing the latest version of the AWS Neuron SDK (2.17) and optimum-neuron (from the screenshot, you are using v0.0.7, which is quite old and might not support the Mistral model you are testing with).
Btw, @dacorvo submitted a new AMI with optimum-neuron v0.0.20, which came out last week with better support for LLMs like Mistral; it will be released very soon.
@JingyaHuang Thank you for your response :) I tried a Neuron AMI (Deep Learning AMI Neuron PyTorch 1.13 (Ubuntu 20.04)) on an inf2.8xlarge instance, but it is still not working. As mentioned in my previous comment, I tried different optimum-neuron versions (0.0.20, 0.0.19, ...; see the screenshots for each). As for the new AMI, that will be really helpful; I will gladly test it when it's available.
I'm seeing the same issue with the newest AMI (huggingface-neuron-2024-03-18T07-48-01Z-692efe1a-8d5c-4033-bcbc-5d99f2d4ae6a)
> optimum-cli neuron
usage: optimum-cli
Optimum CLI tool: error: invalid choice: 'neuron' (choose from 'export', 'env', 'onnxruntime')
optimum version: 1.17.1
transformers version: 4.36.2
Hi @brianloyal and @AhmedAl93, thanks for trying and reporting. I will try to reproduce and let you know.
@brianloyal I reproduced your issue with our latest AMI. cc @philschmid @shub-kris
Updating optimum and then reinstalling optimum-neuron solved the issue:
$ python -m pip install -U optimum
$ python -m pip install optimum-neuron==0.0.20
Hi @AhmedAl93 and @brianloyal, sorry for the late reply. We just released a new DLAMI (20240409) which should solve the issue. Our AMI creation pipeline did not install the package correctly, which caused the problem; the pipeline is now fixed and we will verify the functionality of our DLAMI in future releases. Thanks for your patience; please feel free to try it out and ping me if there is any further issue. THX!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!
I am following this tutorial on how to deploy an LLM to AWS Inferentia2. When I try to convert the model to AWS Neuron using the optimum-cli:
optimum-cli export neuron --model yiyanghkust/finbert-tone --sequence_length 128 --batch_size 1 tmp/
I get the invalid choice: 'neuron' error shown at the top of this thread.