[Open] LameloBally opened this issue 6 months ago
BitsAndBytesConfig with Mistral-7B-v0.1 is not quantized into 4-bit even though I used load_in_4bit=True

System Info

The model loaded with BitsAndBytesConfig and Mistral-7B-v0.1 is not quantized into 4-bit even though I used load_in_4bit=True.

Reproduction

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map={"": 0},
    torch_dtype=torch.bfloat16,
)

# Inspect the raw storage of one of the quantized weight tensors.
modules = list(model.modules())
modules[20].weight.data
```

Expected behavior

`modules[20].weight.data` should be uint4, not uint8.
Hi @LameloBally,

The data is packed into uint8 for storage, but each element actually holds two 4-bit values. PyTorch does not expose a native 4-bit dtype, so the NF4 codes are stored two per byte, which is why the tensor reports `torch.uint8`.
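If it helps to see the idea, here is a minimal packing sketch. This is an illustration of the concept only, not the exact bitsandbytes memory layout:

```python
import torch

# Illustration only (not the exact bitsandbytes layout): two 4-bit codes
# fit in one uint8 byte, so the packed tensor has half as many elements.
codes = torch.tensor([3, 12, 7, 9], dtype=torch.uint8)  # 4-bit values in [0, 15]
packed = (codes[0::2] << 4) | codes[1::2]               # one byte per pair
print(packed.dtype, packed.numel())                     # torch.uint8 2

# Splitting the high and low nibbles recovers the original 4-bit codes.
unpacked = torch.stack((packed >> 4, packed & 0x0F), dim=1).flatten()
assert torch.equal(unpacked, codes)
```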
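And a quick way to confirm the 4-bit load on the model itself. This is a sketch assuming a recent bitsandbytes/transformers; the layer path follows the transformers Mistral implementation and the `model` object from your snippet:

```python
import bitsandbytes as bnb

# Assumes `model` was loaded with the BitsAndBytesConfig above.
layer = model.model.layers[0].self_attn.q_proj

print(isinstance(layer, bnb.nn.Linear4bit))    # True when 4-bit loading worked
print(layer.weight.dtype)                      # torch.uint8 (packed storage)
print(layer.weight.numel())                    # half of the logical element count
print(layer.in_features * layer.out_features)  # logical (unpacked) element count
```

If you need to look at the actual values, `bitsandbytes.functional.dequantize_4bit(layer.weight.data, layer.weight.quant_state)` should recover the weights in the compute dtype.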