AlpinDale opened this issue 9 months ago
It might be caused by version differences among dependencies. You can try replacing line 27 with the following code; it will probably solve the problem.
int8_module.weight = torch.nn.Parameter(module.weight / output_scale)
We will look into this problem and fix it.
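For context, here is a minimal, self-contained sketch of where that one-line change would sit. Note that `Int8Linear` below is a hypothetical stand-in for the project's actual int8 module class, and `output_scale` is assumed to be a scalar produced by the calibration step; only the assignment line itself comes from the suggestion above.

```python
import torch

class Int8Linear(torch.nn.Module):
    # Hypothetical stand-in for the project's int8 linear layer;
    # the real class lives in the repository being discussed.
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.empty(out_features, in_features))

module = torch.nn.Linear(16, 8)   # original floating-point layer
output_scale = 0.05               # assumed scalar scale from calibration
int8_module = Int8Linear(16, 8)

# The suggested fix: divide the weight by the output scale when
# transferring it into the int8 module.
int8_module.weight = torch.nn.Parameter(module.weight / output_scale)
```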
What versions of PyTorch, CUDA, and Transformers are required?
I'm trying to quantize Llama2 7b using the instructions in the readme, but get this:
The scales generate correctly.