defog-ai / sqlcoder

SoTA LLM for converting natural language questions to SQL queries
Apache License 2.0

How much memory does sqlcoder-7B need? It won't run on 16GB of GPU memory, even though other 7B LLMs can. #72

Closed · DAAworld closed this 6 months ago

DAAworld commented 6 months ago

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 124.00 MiB. GPU 0 has a total capacity of 15.77 GiB of which 112.12 MiB is free. Including non-PyTorch memory, this process has 15.66 GiB memory in use. Of the allocated memory 14.08 GiB is allocated by PyTorch, and 698.53 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

rishsriv commented 6 months ago

Hi there, the SQLCoder model will take about 14GB of memory if loaded in fp16, plus a bit more for activations. You can fit it on 16GB for small sequences if you turn off beam search. Alternatively, you can load the model in int8 or int4 by passing the load_in_8bit or load_in_4bit parameter when loading the model, like so:

from transformers import AutoModelForCausalLM

# Load SQLCoder in 4-bit precision so it fits comfortably in ~16GB of GPU memory
model = AutoModelForCausalLM.from_pretrained(
    "defog/sqlcoder-7b-2",
    trust_remote_code=True,
    device_map="auto",
    use_cache=True,
    load_in_4bit=True,
)
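
For completeness, here is a minimal sketch of running generation with beam search turned off (num_beams=1, greedy decoding), which keeps the activation and KV-cache footprint small on a 16GB card. The prompt string below is a placeholder, not the official SQLCoder prompt template, and the max_new_tokens value is just an illustrative choice.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "defog/sqlcoder-7b-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    device_map="auto",
    use_cache=True,
    load_in_4bit=True,  # 4-bit quantization via bitsandbytes
)

# Placeholder prompt; substitute the real SQLCoder prompt format with your schema
prompt = "Generate a SQL query to answer: how many users signed up last week?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,
        num_beams=1,      # beam search off: greedy decoding
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))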