sol0invictus opened this issue 5 months ago
Hi @sol0invictus,

Thank you for the code reference! We reproduced the problem here and intend to ship a fix for this example in the upcoming release. As an immediate workaround, you can either downgrade your transformers version or update the code to handle the `layer_idx` argument.
https://github.com/aws-neuron/neuronx-distributed/blob/a80091de6c9d8eb75f96a7367e143a81d586fbbc/examples/inference/llama2/neuron_modeling_llama.py#L36
The Llama inference example needs to be updated because transformers>=4.36 now requires an additional `layer_idx` argument in the `LlamaDecoderLayer` constructor: https://github.com/huggingface/transformers/blob/v4.37.0/src/transformers/models/llama/modeling_llama.py#L754
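If you want the example to work across both old and new transformers versions, one option is to inspect the constructor signature and only pass `layer_idx` when the installed version accepts it. This is a hedged sketch, not the official fix; the helper name `make_decoder_layer` is ours, and in practice `cls` would be `LlamaDecoderLayer` and `config` a `LlamaConfig`:

```python
import inspect

def make_decoder_layer(cls, config, layer_idx):
    """Construct a decoder layer, passing layer_idx only if the
    class's __init__ accepts it (transformers >= 4.36 requires it,
    older versions reject it)."""
    params = inspect.signature(cls.__init__).parameters
    if "layer_idx" in params:
        return cls(config, layer_idx=layer_idx)
    return cls(config)

# Stand-ins for the two constructor shapes, for illustration only:
class OldStyleLayer:
    def __init__(self, config):
        self.config = config

class NewStyleLayer:
    def __init__(self, config, layer_idx):
        self.config = config
        self.layer_idx = layer_idx
```

In the example's model code you would then build the layer stack with something like `[make_decoder_layer(LlamaDecoderLayer, config, i) for i in range(config.num_hidden_layers)]`, which works regardless of which transformers version is installed.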