Lightning-AI / lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
6k stars 520 forks

(documentation) How do I know if generate.py is running on GPU / GPU configuration #449

Open maathieu opened 1 year ago

maathieu commented 1 year ago

Hi, I have an NVIDIA Quadro P5200 with 32GB of VRAM, yet when I run the code for a test it performs extremely slowly, and in the task manager the GPU's used RAM stays near 0. I think the code is not using my GPU. Is there any special configuration needed beyond pip install -r requirements.txt to get this running on the GPU?

rasbt commented 1 year ago

In general, if you start a new Python session, does

import torch
print(torch.cuda.is_available())

show True?
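
A slightly fuller diagnostic along the same lines might look like this (the `cuda_report` helper is illustrative, not part of lit-llama). A common cause of the symptom above is having the CPU-only PyTorch wheel installed, in which case `cuda_build` prints `None`:

```python
# Sketch of a CUDA visibility check, assuming only that PyTorch is installed.
# The helper name `cuda_report` is ours, not from the lit-llama codebase.
import torch

def cuda_report() -> dict:
    """Collect basic CUDA info for debugging a 'model runs on CPU' issue."""
    info = {
        "available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None means a CPU-only PyTorch build
    }
    if info["available"]:
        # Name of the first visible GPU, e.g. "Quadro P5200"
        info["device_name"] = torch.cuda.get_device_name(0)
    return info

if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_build` is `None`, reinstalling PyTorch with a CUDA-enabled wheel (per the selector on pytorch.org) is usually the fix.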