Closed: awskila closed this issue 6 days ago
Thank you, we are taking a look and will get back to you shortly.
Hi @awskila,
Can you share some test code? I ran tests using the current (2.15) production wheels and was not able to reproduce your problem.
Hi @awskila, the Llama2-13B sample is now available with 2.16. Can you please try it and see if it works? Alternatively, a specific code snippet that reproduces the problem would help move this issue forward. Thanks! https://github.com/aws-neuron/aws-neuron-samples/blob/master/torch-neuronx/transformers-neuronx/inference/meta-llama-2-13b-sampling.ipynb
Closing the ticket. Please re-open if the issue persists.
I am trying to save the Neuron model and deploy it to SageMaker as an endpoint. I noticed that the documentation, under serialization support, states that all models can be loaded or saved except the GPTJ and GPTNeoX model classes.
However, I tried several models, including Llama2-13B, OPT-30B, OPT-66B, and Llama2-70B, and none of them can be saved. I attempted several methods:
1) `<neuron_model>.save`, which doesn't exist; it only appears to exist for GPT2 models.
2) `<neuron_model>.state_dict()`, which fails on all LazyModules.
3) `torch.save`, and TorchScript via `torch.jit.save`, then trying to use the `state_dict()`.

Below is an example using OPT-66B.
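To illustrate why methods 2) and 3) tend to fail, here is a minimal pure-Python analogy (no Neuron or PyTorch dependencies; the class and attribute names are hypothetical, not the actual `transformers-neuronx` internals). A module that holds lazily-initialized weights and a compiled artifact cannot survive the pickle-based path that `torch.save` relies on:

```python
import pickle

class LazyNeuronModule:
    """Hypothetical stand-in for a lazily-initialized Neuron module."""
    def __init__(self):
        # Real weights would only be populated after compilation (to_neuron()).
        self.weights = None
        # Compiled artifacts are typically not picklable; a local lambda
        # stands in for one here.
        self.compiled_kernel = lambda x: x

module = LazyNeuronModule()

try:
    # torch.save uses pickle under the hood, so it hits the same wall.
    pickle.dumps(module)
    print("saved")
except Exception as exc:
    print(f"serialization failed: {type(exc).__name__}")
```

This is only an analogy for the failure mode, not a reproduction of the library's actual behavior.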
Is there anything that can be done to fix this? I've tried the last five versions of `transformers-neuronx`; see here. Please advise. Thanks!