elfisworking opened 1 month ago
I read the code. In the function `filter_fn`:
```python
def filter_fn(child: torch.nn.Module, cur_fqn: str) -> bool:
    return isinstance(child, nn.Linear) and (_check_linear_int4_k(child.in_features, groupsize) or padding_allowed)
```
Maybe adding a condition that `child.bias is None` would be a solution? For example:
```python
def filter_fn(child: torch.nn.Module, cur_fqn: str) -> bool:
    return (
        isinstance(child, nn.Linear)
        and (_check_linear_int4_k(child.in_features, groupsize) or padding_allowed)
        and child.bias is None
    )
```
This would skip linear layers where a bias is present.

cc @andrewor14, can you take a look?
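To illustrate what the extra `child.bias is None` condition would do, here is a small standalone sketch; the toy model and predicate below are only for illustration and are not the actual torchao filter machinery:

```python
# Standalone sketch: only illustrates the effect of adding "child.bias is None"
# to filter_fn; this is not torchao code.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1536, 1536, bias=True),   # like q_proj in Qwen2-1.5B (has a bias)
    nn.Linear(1536, 8960, bias=False),  # a bias-free projection
)

for name, child in model.named_modules():
    if isinstance(child, nn.Linear):
        action = "skip" if child.bias is not None else "quantize"
        print(f"{name}: has_bias={child.bias is not None} -> {action}")
```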
Hi @elfisworking, yes, the easy fix would be to skip the replacement when the layer has a bias. Would you like to submit a fix for this? If not, I can do it too.

Probably the longer-term fix would be to actually support the `bias=True` case. This is currently not supported because the quantized linear used in the convert path (`Int8DynActInt4WeightLinear`) does not support bias. If we make the convert path call the tensor subclass path (using `quantize_(model, int8_dynamic_activations_int4_weight())`) instead, then this problem will be resolved. This is on my TODO list.
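For reference, a minimal sketch of the tensor subclass path mentioned above; the import location and factory name (written here as `int8_dynamic_activation_int4_weight`) are assumptions that may differ between torchao versions:

```python
# Sketch of the tensor-subclass quantization path; the import path and the
# factory name (int8_dynamic_activation_int4_weight) are assumptions that may
# need adjusting for your torchao version.
import torch
import torch.nn as nn
from torchao.quantization import quantize_, int8_dynamic_activation_int4_weight

# Toy model with a biased Linear; the subclass path quantizes the weight in
# place and leaves the bias parameter untouched, so bias=True layers still work.
model = nn.Sequential(nn.Linear(1536, 1536, bias=True)).to(torch.bfloat16)
quantize_(model, int8_dynamic_activation_int4_weight(group_size=32))

print(type(model[0].weight))      # expected: a quantized tensor subclass
print(model[0].bias is not None)  # True: the bias survives quantization
```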
@andrewor14 OK, I will submit a fix.
I use `Int8DynActInt4WeightQATQuantizer` to quantize a Qwen2-1.5B model, but after the prepare function I find that the bias is set to False. The output of my code shows that after the prepare function,

`(q_proj): Linear(in_features=1536, out_features=1536, bias=True)`

has become

`(q_proj): Int8DynActInt4WeightQATLinear(in_features=1536, out_features=1536, bias=False)`

From the torchao code, we can see that the bias is set to False during this replacement. Is there any solution to this problem?
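For context, here is a rough sketch of how the behavior described above can be reproduced on a single layer; the import path for the QAT quantizer and the `groupsize` argument are assumptions that vary between torchao releases (e.g. `torchao.quantization.qat` vs. `torchao.quantization.prototype.qat`):

```python
# Rough reproduction sketch; the import path below is an assumption and may be
# torchao.quantization.prototype.qat in older torchao releases.
import torch
import torch.nn as nn
from torchao.quantization.qat import Int8DynActInt4WeightQATQuantizer

class TinyProj(nn.Module):
    def __init__(self):
        super().__init__()
        # Mirrors Qwen2-1.5B's q_proj, which is created with bias=True
        self.q_proj = nn.Linear(1536, 1536, bias=True)

    def forward(self, x):
        return self.q_proj(x)

model = TinyProj()
quantizer = Int8DynActInt4WeightQATQuantizer(groupsize=32)
model = quantizer.prepare(model)
# q_proj is now Int8DynActInt4WeightQATLinear(..., bias=False): the bias is dropped.
print(model)
```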