intel / neural-compressor

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
https://intel.github.io/neural-compressor/
Apache License 2.0

'q_config' is needed when export an INT8 model #1736

Open ZhangShuoAlreadyExists opened 2 months ago

ZhangShuoAlreadyExists commented 2 months ago

Hi,

I want to convert and quantize a PyTorch model to an ONNX model, following this example: https://github.com/intel/neural-compressor/blob/master/examples/pytorch/image_recognition/torchvision_models/export/fx/main.py When I call the export function, I get the error "'q_config' is needed when export an INT8 model". I don't see anything about q_config in the example code. How can I solve this issue?

Here is my code:

if __name__ == "__main__":
    model = timm.create_model('resnet50.a1_in1k', pretrained=True)
    model = model.eval()

val_dataset = SampleDataset('golden_image')
val_loader = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=1, shuffle=False,
    num_workers=1, pin_memory=True)

conf = PostTrainingQuantConfig(approach='static')
q_model = quantization.fit(model,
                           conf,
                           calib_dataloader=val_loader) # Don't need tuning.
int8_onnx_config = Torch2ONNXConfig(
    dtype="int8",
    opset_version=14,
    quant_format="QDQ",
    example_inputs=torch.randn(1, 3, 224, 224),
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={"input": {0: "batch_size"},
                    "output": {0: "batch_size"}},
)

inc_model = Model(q_model)
inc_model.export("resnet50_int8.onnx", int8_onnx_config)

Error message:

neural_compressor\experimental\export\torch2onnx.py", line 389, in torch_to_int8_onnx
    assert q_config is not None, "'q_config' is needed when export an INT8 model."
AssertionError: 'q_config' is needed when export an INT8 model.

torch.__version__: '2.2.2+cpu'
neural_compressor.__version__: '2.6'

NeoZhangJianyu commented 2 months ago

@ZhangShuoAlreadyExists We will check it and give feedback as soon as possible!

NeoZhangJianyu commented 2 months ago

@ZhangShuoAlreadyExists I created an example based on your code, and it passes. Could you refer to it?

Install packages:

pip install neural_compressor
pip install torch torchvision onnxruntime onnx

Code:


import torch
import torch.utils.data
import torchvision
import torchvision.transforms as transforms
import torchvision.datasets as datasets

from neural_compressor import PostTrainingQuantConfig
from neural_compressor import quantization
from neural_compressor.config import Torch2ONNXConfig
from neural_compressor.model import Model

model = torchvision.models.resnet50(pretrained=True)
model = model.eval()

Transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5, 0.5, 0.5),
                         std=(0.5, 0.5, 0.5))])

test_data = datasets.CIFAR10(
    root="data",
    train=False,
    download=True,
    transform=Transform,
)

val_loader = torch.utils.data.DataLoader(
    test_data,
    batch_size=1, shuffle=False,
    num_workers=1, pin_memory=True)

conf = PostTrainingQuantConfig(approach='static')
q_model = quantization.fit(model,
                           conf,
                           calib_dataloader=val_loader) # Don't need tuning.
int8_onnx_config = Torch2ONNXConfig(
    dtype="int8",
    opset_version=14,
    quant_format="QDQ",
    example_inputs=torch.randn(1, 3, 224, 224),
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={"input": {0: "batch_size"},
                    "output": {0: "batch_size"}},
)

# inc_model = Model(q_model)
q_model.export("resnet50_int8.onnx", int8_onnx_config)
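
The notable change from the snippet in the issue is that export is called directly on the q_model returned by quantization.fit, instead of on a fresh Model(q_model) wrapper; the wrapper does not appear to carry the q_config that the INT8 export path asserts on.

As a quick sanity check of the exported file, a minimal sketch (assuming the export above wrote resnet50_int8.onnx to the working directory and onnxruntime is installed):

import numpy as np
import onnxruntime as ort

# Load the exported INT8 QDQ model and run one dummy batch on CPU.
session = ort.InferenceSession("resnet50_int8.onnx",
                               providers=["CPUExecutionProvider"])

dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(["output"], {"input": dummy})

print(outputs[0].shape)  # expected (1, 1000) for an ImageNet-pretrained ResNet-50
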
NeoZhangJianyu commented 2 months ago

@ZhangShuoAlreadyExists Could you confirm whether this answer helps you?

Thank you!