-
Hello,
I am trying to perform QAT on a ResNet50 network with BN layers, and I keep getting the following error:
```
ValueError: Shape must be rank 4 but is rank 5 for '{{node batch_normalization_…
-
Need to complete the section that deals with
Quantizing Inputs {#sec-quantize}
-
When running `examples/quantization/basic_usage_gpt_xl.py` an error occurs during the model packing:
```
2023-05-22 04:08:34 INFO [auto_gptq.quantization.gptq] duration: 0.16880011558532715
2023-…
-
Hi,
Thank you for the tutorial. I am using Python 3.7 to match the supported version, but I am having trouble quantizing the optimized model.
"Generating the quantization table:
Constant is not supporte…
-
Hi, how do I cast a float/bfloat16 tensor to FP8? I want to perform W8A8 (FP8) quantization, but I couldn't find an example of quantizing activations to the FP8 format.
-
Hey, I'm using the MX datatypes. It seems that `aten.linear.default` has not been implemented, which prevents the linear layers in the attention blocks from working with the MX datatypes.
Can you…
-
From my own experience with text generation models, I found that quantizing the output and embed tensors to f16 and the other tensors to q6_k (or q5_k) gives smaller files and better results than qu…
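For anyone wanting to reproduce this mix: llama.cpp's `llama-quantize` tool has per-tensor-type overrides for exactly this (flag names as of recent builds; check `llama-quantize --help` on your version, and treat the file names here as placeholders):

```shell
# Keep the output and token-embedding tensors at f16,
# quantize everything else to Q6_K:
./llama-quantize \
    --output-tensor-type f16 \
    --token-embedding-type f16 \
    model-f16.gguf model-q6_k-mix.gguf Q6_K
```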
-
Hi all,
I was trying to quantize my model but something strange popped up.
I am using TensorFlow v2.14 and tfmot v0.7.5
I have a sub-classed `tf.keras.Model`. It contains some custom layers and…
-
Hi. First of all, thanks for the awesome work! This issue is more of a question. I've been trying to quantize the yolov4 model (I excluded the postprocessing part of the model) by referencing this [tutor…
-
First of all, thank you for great work.
## System info
autoawq==0.1.8
## Details
While trying to quantize a GPT NeoX model, I encountered the error below.
```
>>> from awq import AutoAWQForCa…