pytorch / FBGEMM

FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/

AssertionError: Per channel weight observer is not supported yet for ConvTranspose{n}d. #1576

Open qaixerabbas opened 1 year ago

qaixerabbas commented 1 year ago

I am trying to quantize a Wav2Lip PyTorch model. When I run the code using the fbgemm backend, I run into the following error:

AssertionError: Per channel weight observer is not supported yet for ConvTranspose{n}d.

The model uses ConvTranspose2d layers; as per my understanding, quantization should work for the other layers. When I change the backend engine to "qnnpack", it runs into the same problem, but according to the qnnpack repo, ConvTranspose2d is not supported yet anyway. How can I use the "fbgemm" backend to quantize my target model? Any helpful material would be highly appreciated. I am currently using the following code.

import torch
from models import Wav2Lip
from quantized import load_model

device = "cuda" if torch.cuda.is_available() else "cpu"
backend = "fbgemm"
# backend = "qnnpack"
checkpoint_path = "model_data/wav2lip_gan.pth"

model = load_model(checkpoint_path)
model.eval()

# Use 'fbgemm' for server inference and 'qnnpack' for mobile inference
# backend = "fbgemm"  # replaced with qnnpack causing much worse inference speed for quantized model on this notebook
model.qconfig = torch.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend

quantized_model = torch.quantization.quantize_dynamic(
    model, qconfig_spec={torch.nn.Linear}, dtype=torch.qint8
)
# scripted_quantized_model = torch.jit.script(quantized_model)
# nn.Module has no .save() method; use torch.save instead
torch.save(quantized_model.state_dict(), "wav2lip_gan_quantized.pth")
torch.save(quantized_model, "wav2lip_gan_quantized_int8.pth")
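
For context, the assertion comes from eager-mode quantized ConvTranspose modules: the fbgemm default qconfig attaches a per-channel weight observer, which quantized ConvTranspose{n}d does not support. A minimal sketch of one possible workaround (not from this thread, and worth validating on your model) is to override the weight observer with the per-tensor default, either for the whole model or only for the ConvTranspose2d modules:

import torch
from torch.quantization import QConfig, HistogramObserver, default_weight_observer

# fbgemm's default activation observer, but a per-tensor weight observer
# instead of the per-channel default that ConvTranspose{n}d rejects
per_tensor_qconfig = QConfig(
    activation=HistogramObserver.with_args(reduce_range=True),
    weight=default_weight_observer,
)

model.qconfig = per_tensor_qconfig
# or keep the per-channel default elsewhere and only override ConvTranspose2d:
# for m in model.modules():
#     if isinstance(m, torch.nn.ConvTranspose2d):
#         m.qconfig = per_tensor_qconfig

Per-tensor weight quantization can cost some accuracy compared with per-channel, so it is worth checking the converted model's outputs against the float model.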
MYTHOFLEGEND commented 2 months ago

Have you solved this problem?

qaixerabbas commented 2 months ago

@MYTHOFLEGEND I could not solve this and went for alternatives that I don't remember now.