bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.
https://huggingface.co/docs/bitsandbytes/main/en/index

Mistral-v0.1 nf4 is not quantized into 4bit #1203

Open · LameloBally opened this issue 6 months ago

LameloBally commented 6 months ago

System Info

A model loaded with a BitsAndBytesConfig for Mistral-7B-v0.1 does not appear to be quantized into 4-bit, even though load_in_4bit=True is set.

Reproduction

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map={"": 0},
    torch_dtype=torch.bfloat16,
)

module = list(model.modules())
module[20].weight.data
```

Expected behavior

module[20].weight.data should be uint4, not uint8.

matthewdouglas commented 5 months ago

Hi @LameloBally,

The data is packed into uint8 for storage, but each uint8 element actually holds two 4-bit values.
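You can verify this by inspecting one of the quantized linear layers and dequantizing its weight. A minimal sketch, assuming the model from the reproduction above is loaded; it relies on bitsandbytes' Linear4bit/Params4bit modules and bitsandbytes.functional.dequantize_4bit:

```python
import bitsandbytes as bnb

# Grab a 4-bit quantized linear layer (module[20] in the repro is one of
# these after loading with load_in_4bit=True).
layer = next(m for m in model.modules() if isinstance(m, bnb.nn.Linear4bit))
w = layer.weight  # a bnb.nn.Params4bit parameter

# Storage view: uint8, with two 4-bit NF4 codes packed per byte, so the
# element count is half the number of logical weights.
print(w.dtype)  # torch.uint8
print(w.shape)  # roughly (out_features * in_features / 2, 1)

# Logical view: dequantize with the stored quantization state to recover
# the weights at their original shape in the compute dtype.
w_dq = bnb.functional.dequantize_4bit(w.data, w.quant_state)
print(w_dq.shape, w_dq.dtype)  # (out_features, in_features), torch.bfloat16
```

The arithmetic works out as expected for 4-bit storage: a 4096x4096 bfloat16 weight (16,777,216 values, 32 MiB) packs into 8,388,608 uint8 bytes (8 MiB), plus a small per-block overhead for the quantization statistics.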