bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index

DBRX Support #1155

Open maziyarpanahi opened 6 months ago

maziyarpanahi commented 6 months ago

Feature request

Support for DBRX Instruct model in bitsandbytes

Motivation

DBRX Instruct is supposed to be the best open LLM, but at 132B parameters it is unusable for most people. I tried this:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TextStreamer

model_id = "/home/maziyar/.cache/huggingface/hub/models--databricks--dbrx-instruct/"

# 4-bit NF4 quantization with double quantization and bf16 compute
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_nf4 = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=nf4_config,
    device_map="auto",
    trust_remote_code=True,
)

But it loads the model at full precision instead of quantizing it. (maybe I am missing something)
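For reference, here is a quick sanity check, using standard transformers/PyTorch calls, to see whether the 4-bit load actually took effect (model_nf4 is the model loaded above):

import bitsandbytes as bnb

# If quantization worked, the footprint should be roughly a quarter of the
# bf16 size, and the linear layers should be bitsandbytes Linear4bit modules.
print(model_nf4.get_memory_footprint() / 1e9, "GB")
print(any(isinstance(m, bnb.nn.Linear4bit) for m in model_nf4.modules()))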

Your contribution

I am willing to test any PR

maziyarpanahi commented 6 months ago

Maybe a relevant conversation: https://huggingface.co/databricks/dbrx-instruct/discussions/10

johnrachwan123 commented 6 months ago

We published a 4-bit bnb version here: https://huggingface.co/PrunaAI/dbrx-base-bnb-4bit :)

maziyarpanahi commented 6 months ago

Thanks @johnrachwan123 - So is the current bitsandbytes compatible with DBRX?

mallorbc commented 6 months ago

The model weights don't use nn.Linear, so it's not an out-of-the-box solution. There are converted versions of the model out there that work right away.
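For anyone wondering why that matters: bitsandbytes-style loading quantizes a model by swapping its nn.Linear modules. Here is a minimal sketch of that swap (an assumed illustration, not transformers' actual implementation); weights stored as fused Parameter tensors, as in DBRX's original MoE experts, never match the isinstance check and are left unquantized:

import torch.nn as nn
import bitsandbytes as bnb

def replace_linears_with_4bit(module: nn.Module) -> None:
    # Walk the module tree and swap every nn.Linear for a 4-bit equivalent.
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            quantized = bnb.nn.Linear4bit(
                child.in_features,
                child.out_features,
                bias=child.bias is not None,
                quant_type="nf4",
            )
            setattr(module, name, quantized)
        else:
            # Fused expert weights are plain Parameters, not nn.Linear,
            # so the recursion passes over them and they stay unquantized.
            replace_linears_with_4bit(child)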

I was able to load this model with bitsandbytes, and while I didn't try generation, I was able to train the model a bit and see a decreasing loss.

SinclairSchneider/dbrx-base-quantization-fixed

Some features do not work, like gradient checkpointing, but I think it's good enough for now until it's officially supported.
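For anyone who wants to reproduce that kind of training run, here is a minimal QLoRA-style sketch (an assumed setup, not the exact script used above; the target_modules names are placeholders and DBRX's actual projection names may differ):

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Freeze the 4-bit base weights and train small LoRA adapters on top.
model = prepare_model_for_kbit_training(model_nf4)
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # placeholder module names
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable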