Closed: lvnair3 closed this issue 10 months ago
This issue was resolved once all the steps in the Neuron SDK setup guide here were followed (I had missed a few of them earlier): https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/neuron-setup/pytorch/neuronx/ubuntu/torch-neuronx-ubuntu20.html
Thank you Lakshmi for reporting this. While we try to reproduce the issue on our end, could you try without the --auto-cast-type fp8_e4m3 option, which casts the model to the fp8_e4m3 type? By default, we cast to bf16. You could also try --auto-cast-type fp16 and see if that unblocks you.
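For context, a minimal sketch of what those two suggestions could look like, assuming the option is passed as neuronx-cc compiler arguments to torch_neuronx.trace (the variable names here are illustrative, not taken from the attached script):

```python
# Suggestion 1: drop the fp8_e4m3 cast entirely; the Neuron compiler
# then falls back to its default of casting to bf16.
compiler_args = []

# Suggestion 2: cast to fp16 instead of fp8_e4m3.
compiler_args = ["--auto-cast-type", "fp16"]

# Either list would then be passed to the trace call, e.g.:
# neuron_model = torch_neuronx.trace(model, example_inputs, compiler_args=compiler_args)
```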
Glad you were able to resolve your issue
Closing this issue since it appears to be resolved
Task
OPT 1.3B inference on Wikitext2 using E4M3 on Trainium
Trn1
Inference Script
Full script is attached as script.zip. Essentially, it is an adaptation of the run_clm_no_trainer.py script from HuggingFace here. I've adapted the script to perform only inference (no training) and to include the following block of code for NeuronX:
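(A minimal sketch, not the attached script, of what such a NeuronX block could look like, assuming torch_neuronx.trace with neuronx-cc compiler_args; the checkpoint name comes from this issue, while the example text and sequence length are assumptions for illustration.)

```python
import torch_neuronx
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lnair/opt-1.3b-wikitext2"  # checkpoint referenced in this issue
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torchscript=True)
model.eval()

# Example inputs used only for tracing; the sequence length is an assumption.
enc = tokenizer(
    "Example text for tracing",
    return_tensors="pt",
    padding="max_length",
    max_length=128,
)
example_inputs = (enc["input_ids"], enc["attention_mask"])

# Compile for Trainium, asking the Neuron compiler to auto-cast to fp8_e4m3
# (the option discussed above).
neuron_model = torch_neuronx.trace(
    model,
    example_inputs,
    compiler_args=["--auto-cast-type", "fp8_e4m3"],
)
```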
Run command
NOTE: The model lnair/opt-1.3b-wikitext2 is a fine-tuned version of facebook/opt-1.3b (no architectural changes here). Nevertheless, it fails on both the lnair/opt-1.3b-wikitext2 and facebook/opt-1.3b checkpoints.
Error
This script for OPT 1.3B fails with the following error:
Thanks in advance!