aws-neuron / aws-neuron-sdk

Powering AWS purpose-built machine learning chips. Blazing fast and cost effective, natively integrated into PyTorch and TensorFlow and integrated with your favorite AWS services
https://aws.amazon.com/machine-learning/neuron/

Trying to trace a model via torch_neuronx but getting issue: RuntimeError: The following operation failed in the TorchScript interpreter #835

Closed ramsum56 closed 3 months ago

ramsum56 commented 4 months ago

I am trying to convert a BERT model into a Neuron-traced model, but I get this error on the line where I execute the model. Further down the output I also see: The PyTorch Neuron Runtime could not be initialized. Neuron Driver issues are logged. Running lsmod | grep neuron prints: neuron 253952 0

aws-donkrets commented 4 months ago

Hi ramsum56, sorry you are having an issue. Can you provide the exact error or post a log file? It would also help to know which SDK version you are running. Please share the output of the following commands:

$ yum list installed | grep neuron                  ## on AL2
$ apt list | grep neuron | grep installed           ## on Ubuntu
$ pip3 list | grep neuron                           ## both AL2 and Ubuntu
$ cat /sys/devices/virtual/dmi/id/product_name      ## both AL2 and Ubuntu
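For anyone hitting the same request, the commands above can be gathered in one pass. The following is only a minimal sketch using the Python standard library. The `COMMANDS` table and the `collect` helper are illustrative names, not part of the Neuron SDK, and the `lsmod` entry is taken from the original report rather than from the list above. Commands whose tool is not installed (for example `yum` on Ubuntu) are skipped.

```python
import shutil
import subprocess

# Commands quoted from this thread, plus the lsmod check from the original report.
# The labels are illustrative only.
COMMANDS = {
    "yum packages (AL2)": "yum list installed | grep neuron",
    "apt packages (Ubuntu)": "apt list | grep neuron | grep installed",
    "pip packages": "pip3 list | grep neuron",
    "product name": "cat /sys/devices/virtual/dmi/id/product_name",
    "kernel module": "lsmod | grep neuron",
}


def collect(commands, timeout=120):
    """Run each shell command and return {label: captured stdout or a status note}."""
    results = {}
    for label, command in commands.items():
        tool = command.split()[0]
        # Skip commands whose first tool is missing on this distribution.
        if shutil.which(tool) is None:
            results[label] = f"skipped: {tool} not found"
            continue
        try:
            # shell=True so the pipes in the quoted commands work unchanged.
            proc = subprocess.run(
                command, shell=True, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            results[label] = "timed out"
            continue
        # grep exits non-zero when nothing matches; report that as empty output.
        results[label] = proc.stdout.strip() or "(no output)"
    return results


if __name__ == "__main__":
    for label, output in collect(COMMANDS).items():
        print(f"== {label} ==\n{output}\n")
```

Pasting the printed report into the issue gives the package versions and the product name in one block.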

aws-donkrets commented 4 months ago

ramsum56, you can also try tracing with torch.jit.trace.
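A rough sketch of what that check could look like is below. It assumes only that PyTorch is installed. `TinyClassifier` is a hypothetical stand-in for the BERT model, which was not posted in this thread, and the input shapes are made up for illustration. If plain `torch.jit.trace` succeeds on CPU, the problem is more likely in the Neuron runtime or driver than in the model itself.

```python
import torch
from torch import nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for the BERT model; takes the same two inputs."""

    def __init__(self, vocab_size=100, hidden=16, num_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.embed(input_ids)
        # Mean-pool token embeddings over the non-padded positions.
        mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
        return self.head(pooled)


model = TinyClassifier().eval()

# Example inputs with an assumed batch size of 1 and sequence length of 8.
input_ids = torch.randint(0, 100, (1, 8))
attention_mask = torch.ones(1, 8, dtype=torch.long)
example_inputs = (input_ids, attention_mask)

# Trace on CPU only; no Neuron device or driver is involved here.
traced = torch.jit.trace(model, example_inputs)

# The traced module should reproduce the eager output for the same inputs.
with torch.no_grad():
    same = torch.allclose(traced(*example_inputs), model(*example_inputs))
print("traced output matches eager output:", same)
```

With a real model, replace `TinyClassifier` and the example inputs with your own model and tokenizer output, keeping the inputs as a tuple of tensors.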

mrnikwaws commented 3 months ago

Hi @ramsum56 - closing since we have not heard back from you in two weeks. Please re-open or create another ticket with the requested information if you are still seeing issues.