massi-ang closed this issue 1 month ago
The notebook instructions state:
After gaining access to the model checkpoints, you should be able to use the already converted checkpoints.
This does not seem to be the case. I tried running python src/transformers/models/llama/convert_llama_weights_to_hf.py
on the downloaded folder, but the process quits after a few seconds with
Killed
Running on inf2.xlarge with 256 GB of EBS storage.
The instructions in the notebook state:
Follow the steps described in meta-llama/Llama-2-13b to get access to the Llama 2 model from Meta and download the weights and tokenizer.
To get the ready-to-use models, one needs to get the -hf
versions, i.e. Llama-2-13b-hf.
Please update the notebook instructions accordingly.
Thank you for reporting the issue. After looking into it, there are two things that need attention:
You are trying to load a 13B model on inf2.xlarge, which does not have enough host/device memory to load/run the model; hence the program gets killed. This tutorial requires an inf2.48xlarge or a trn1.32xlarge instance.
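As a rough back-of-the-envelope check (the host RAM figures below are assumptions based on AWS's published instance specs, not numbers from this thread): the fp16 weights of a 13B-parameter model alone take about 13e9 × 2 bytes ≈ 26 GB, which already exceeds what an inf2.xlarge host offers.

```python
# Back-of-the-envelope memory check. The host RAM figures are assumptions
# based on AWS's published instance specs, not values from this thread.
def weights_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the raw weights in GB (fp16 = 2 bytes/param)."""
    return n_params * bytes_per_param / 1e9

LLAMA_2_13B_PARAMS = 13e9
HOST_RAM_GB = {  # approximate host memory per instance type (assumed)
    "inf2.xlarge": 16,
    "inf2.48xlarge": 768,
    "trn1.32xlarge": 512,
}

needed = weights_gb(LLAMA_2_13B_PARAMS)  # ~26 GB for the fp16 weights alone
for instance, ram in HOST_RAM_GB.items():
    verdict = "fits" if ram > needed else "likely OOM-killed"
    print(f"{instance}: {ram} GB host RAM -> {verdict}")
```

This ignores activations, KV cache, and the temporary copies made during checkpoint conversion, so the real requirement is higher still, which is consistent with the conversion script itself being killed on the small instance.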
Where is that written?
Regarding the missing config.json, it looks like you may have downloaded the wrong artifacts. Please check the download model section in the notebook, which points to the HF repo from which the model needs to be downloaded.
The notebook points to the normal version and not the HF version. Can you point to the line of the notebook where this is mentioned?
There is now a line in the tutorial that specifies the instance to use:
This Jupyter Notebook can be run on an Inf2 instance (inf2.48xlarge) or Trn1 instance (trn1.32xlarge).
It points to the HF link where there are instructions on how to download the model: https://huggingface.co/meta-llama/Llama-2-13b
Closing the issue now. Please re-open if the issue still exists.
Trying to execute: https://github.com/aws-neuron/aws-neuron-samples/blob/master/torch-neuronx/transformers-neuronx/inference/meta-llama-2-13b-sampling.ipynb
After cloning the Llama-2-13b repo from Hugging Face, I get the following content:
config.json
is missing and the code is complaining about it.
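A quick way to tell the two checkpoint layouts apart (the file names below reflect the usual contents of the raw Meta repo vs. the -hf repo, an assumption rather than something verified in this thread): the raw Meta download ships params.json and consolidated.*.pth shards, while the -hf repo ships the config.json that transformers expects.

```python
# Sketch: detect which checkpoint layout a downloaded folder contains.
# The marker file names are assumptions about the usual repo contents.
from pathlib import Path

def checkpoint_flavor(folder: str) -> str:
    p = Path(folder)
    if (p / "config.json").exists():
        return "hf"        # ready to load with transformers directly
    if (p / "params.json").exists():
        return "meta-raw"  # needs convert_llama_weights_to_hf.py first
    return "unknown"

# Usage: simulate a raw Meta download in a temporary folder
import tempfile
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "params.json").touch()
    print(checkpoint_flavor(d))  # -> meta-raw
```

If this prints "meta-raw" for the folder you cloned, you have the original Meta checkpoint and need either the -hf repo or a successful conversion run before the notebook can load it.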