THUDM / LongCite

LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

huge OutOfMemoryError #7

Closed: aceliuchanghong closed this issue 1 week ago

aceliuchanghong commented 1 week ago

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 179.46 GiB. GPU 0 has a total capacity of 31.73 GiB of which 14.55 GiB is free. Process 2995003 has 406.00 MiB memory in use. Process 2375671 has 1.59 GiB memory in use. Including non-PyTorch memory, this process has 15.20 GiB memory in use. Of the allocated memory 11.80 GiB is allocated by PyTorch, and 3.03 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

The model is LongCite-llama3.1-8b. Why does it need so much memory? I can't even run it on 4×32 GB GPUs.

Neo-Zhangjiajie commented 1 week ago

Did you try the example code at https://huggingface.co/THUDM/LongCite-llama3.1-8b? You need to load the model in bfloat16.
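
For reference, here is a minimal sketch of loading the model in bfloat16, adapted from that model card. The `query_longcite` helper and its argument names come from the card's custom code and are assumptions here; they may differ from the exact example on the page.

```python
import json
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "THUDM/LongCite-llama3.1-8b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # load weights in bf16 instead of fp32, halving memory
    trust_remote_code=True,
    device_map="auto",           # requires the `accelerate` package
)

context = "..."  # the long document to cite from
query = "..."    # the question about that document

# Helper defined by the model's custom code on the Hugging Face card
# (assumed signature; check the card for the exact arguments).
result = model.query_longcite(
    context, query, tokenizer=tokenizer,
    max_input_length=128000, max_new_tokens=1024,
)
print(json.dumps(result, indent=2, ensure_ascii=False, default=str))
```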

aceliuchanghong commented 1 week ago

Oh, I see. I ran demo.py with only the model changed to LongCite-llama3.1-8b, so that's why it went wrong.

Let me try that example.

Huh, you missed `import json` in that example; after adding it, it worked.

Also, the project's requirements.txt is missing:

torch
tiktoken
accelerate
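
Until requirements.txt is updated, they can be installed manually with `pip install torch tiktoken accelerate` (the standard PyPI package names).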

Thank you for your help.