meta-llama / llama-recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A, and a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama3 for WhatsApp & Messenger.

AMD GPU stuck at model.eval() when following the Quick Start Jupyter Notebook #278

Closed PatchouliPatch closed 10 months ago

PatchouliPatch commented 10 months ago

System Info

- PyTorch: 2.1.0+rocm5.6
- ROCm: 5.7.1
- GPU: Sapphire RX 7900 XTX
- Python: 3.10.12
- OS: Ubuntu 22.04.3

Information

šŸ› Describe the bug

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_id = "meta-llama/Llama-2-7b-hf"

tokenizer = LlamaTokenizer.from_pretrained(model_id)

# Unlike the quick-start notebook, load_in_8bit=True is omitted here
# because bitsandbytes does not support AMD GPUs.
model = LlamaForCausalLM.from_pretrained(model_id, device_map='auto', torch_dtype=torch.float16)

print("Loading dataset.")

from llama_recipes.utils.dataset_utils import get_preprocessed_dataset
from llama_recipes.configs.datasets import samsum_dataset

train_dataset = get_preprocessed_dataset(tokenizer, samsum_dataset, 'train')

eval_prompt = """
Summarize this dialog:
A: Hi Tom, are you busy tomorrow's afternoon?
B: I'm pretty sure I am. What's up?
A: Can you go with me to the animal shelter?.
B: What do you want to do?
A: I want to get a puppy for my son.
B: That will make him so happy.
A: Yeah, we've discussed it many times. I think he's ready now.
B: That's good. Raising a dog is a tough issue. Like having a baby ;-)
A: I'll get him one of those little dogs.
B: One that won't grow up too big;-)
A: And eat too much;-))
B: Do you know which one he would like?
A: Oh, yes, I took him there last Monday. He showed me one that he really liked.
B: I bet you had to drag him away.
A: He wanted to take it home right away ;-).
B: I wonder what he'll name it.
A: He said he'd name it after his dead hamster – Lemmy - he's a great Motorhead fan :-)))

Summary:
"""

print("loading tokenizer")
model_input = tokenizer(eval_prompt, return_tensors="pt").to("cuda")

print("evaluating...")
model.eval()
with torch.no_grad():
    print(tokenizer.decode(model.generate(**model_input, max_new_tokens=100)[0], skip_special_tokens=True))
print("evaluation complete.")
```

For some reason, the evaluation never finishes. Note that the only difference between my script and the official Llama 2 quick-start Jupyter notebook is that I did not pass load_in_8bit=True, since AMD GPUs are not supported by bitsandbytes.
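As a first check in situations like this, it can help to verify that the ROCm build of PyTorch actually sees the card and can complete a kernel before any model code runs. A minimal sketch (`torch.version.hip` is only populated on ROCm builds of PyTorch):

```python
import torch

# On ROCm builds of PyTorch, torch.version.hip reports the HIP/ROCm
# version the wheel was compiled against (it is None on CUDA/CPU builds).
print("HIP version:", torch.version.hip)
print("GPU visible:", torch.cuda.is_available())

if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    # Run a tiny kernel and force a synchronize; if this also hangs,
    # the problem is in the ROCm install, not the model code.
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x
    torch.cuda.synchronize()
    print("Matmul OK:", y.shape)
```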

(screenshot: GPU monitor during the hang) As seen in the image, the model is loaded onto the GPU, but something seems to prevent it from being run properly. GPU utilization sits at 100%, yet the memory clock is seemingly too low for that to be right.
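To confirm from inside Python that the weights actually reside in GPU memory (rather than trusting the monitor alone), one can print PyTorch's allocator counters after loading the model; a small sketch, noting that these counters behave the same on ROCm builds:

```python
import torch

# Rough view of how much of the model sits on the GPU; fp16 weights
# for a 7B model should occupy roughly 13-14 GB.
print(f"allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1e9:.2f} GB")
```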

Error logs

No error logs, as the program is stuck at model.eval(). Attempting to end the program with Ctrl+C fails, and closing the terminal running it crashes the whole computer, requiring a reboot.
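When a process hangs with no logs and ignores Ctrl+C, Python's standard faulthandler module can show exactly which line it is blocked on. A minimal sketch, assuming these lines are added near the top of the script before the hang occurs:

```python
import faulthandler
import signal

# Dump every thread's Python stack when the process receives SIGUSR1,
# e.g. by running `kill -USR1 <pid>` from another terminal. This works
# even when Ctrl+C is being swallowed by a stuck GPU call.
faulthandler.register(signal.SIGUSR1)

# Alternatively, dump the stacks automatically if the program is still
# running after 10 minutes, without killing it.
faulthandler.dump_traceback_later(600, exit=False)
```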

Expected behavior

I expect the model to generate a summary of eval_prompt similar to the output shown in the quick-start Jupyter notebook.

PatchouliPatch commented 10 months ago

Hello! It seems my earlier installation of ROCm 5.7.1 was borked. For anyone who ran into the same issue: uninstall ROCm 5.7.1 and then reinstall it. I used the following command:

```
sudo amdgpu-install --rocmrelease=5.7.1 --no-dkms
```
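After a reinstall like this, a deliberately tiny generation makes a quick smoke test: if it returns within seconds instead of hanging, the full script should work too. A sketch based on the original code above:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

# Generate only 5 tokens from a one-word prompt; a healthy ROCm stack
# finishes this almost immediately, while a broken one hangs as before.
inputs = tokenizer("Hello", return_tensors="pt").to("cuda")
model.eval()
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```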