alibaba / FederatedScope

An easy-to-use federated learning platform
https://www.federatedscope.io
Apache License 2.0

Error with 4 bit quantized LLM #754

Closed · KKNakkav2 closed this issue 7 months ago

KKNakkav2 commented 7 months ago

In the LLM model_builder.py (https://github.com/alibaba/FederatedScope/blob/7f086944c57f85c7594bde44d4f6b981f0de6845/federatedscope/llm/model/model_builder.py#L22), if we enable 4-bit quantization when loading the model,

    kwargs["load_in_4bit"] = True

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        **kwargs)

local training throws the following error after the first aggregation round.

File "/home/krishna/miniconda3/envs/fs-llm/lib/python3.9/site-packages/bitsandbytes/nn/modules.py", line 256, in forward
    out = bnb.matmul_4bit(x, self.weight.t(), bias=bias, quant_state=self.weight.quant_state)
  File "/home/krishna/miniconda3/envs/fs-llm/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py", line 577, in matmul_4bit
    return MatMul4Bit.apply(A, B, out, bias, quant_state)
  File "/home/krishna/miniconda3/envs/fs-llm/lib/python3.9/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/krishna/miniconda3/envs/fs-llm/lib/python3.9/site-packages/bitsandbytes/autograd/_functions.py", line 516, in forward
    output = torch.nn.functional.linear(A, F.dequantize_4bit(B, quant_state).to(A.dtype).t(), bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (86x768 and 1x294912)

This error seems to be caused by the way the 4-bit weights are stored as a flat 1D tensor (https://github.com/TimDettmers/bitsandbytes/issues/902). Can the authors please advise whether you were able to run the experiments with quantized LLMs? Is there a workaround for this issue?
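The shapes in the traceback are consistent with this explanation. The `86x768` activation fixes `in_features = 768`, and a `768x768` linear weight packed at two 4-bit values per uint8 byte yields exactly the flat 294912-element buffer that torch complains about; a minimal sketch of that arithmetic (the 768x768 layer size is inferred from the error, not confirmed in the source):

```python
# Shape arithmetic behind the RuntimeError, assuming a 768x768 linear
# layer (in_features = 768 is read off the 86x768 input activation).
out_features, in_features = 768, 768

# bitsandbytes packs two 4-bit values into each uint8 byte, so the
# quantized weight is stored as a flat buffer of half the element count,
# with shape (packed_bytes, 1) rather than (out_features, in_features).
packed_bytes = out_features * in_features // 2
print(packed_bytes)  # 294912 -- matches the (1, 294912) mat2 in the error
```

So the multiplication fails because the packed buffer reaches `torch.nn.functional.linear` without being dequantized back to its original 2D shape, which requires the accompanying `quant_state`.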

Thank you

KKNakkav2 commented 7 months ago

The error is resolved by updating the `Params4bit` class in the bitsandbytes library. Thank you.

jkminder commented 7 months ago

> The error is resolved by updating the Params4Bit file in the bitsandbytes library. Thank you.

I have a similar problem. Could you share your changes to `Params4bit`?

Liuyong-zhixing commented 5 months ago

> The error is resolved by updating the Params4Bit file in the bitsandbytes library. Thank you.

I have a similar problem with the newest bitsandbytes version (0.43.0). Which bitsandbytes version did you use?