Xilinx / Vitis-AI

Vitis AI is Xilinx’s development stack for AI inference on Xilinx hardware platforms, including both edge devices and Alveo cards.
https://www.xilinx.com/ai
Apache License 2.0

Register Aten Custom Ops #1208

Open dungng27 opened 1 year ago

dungng27 commented 1 year ago

Hi, I'm trying to quantize and compile a PyTorch model that uses some ATen operations not yet supported by Vitis-AI. Specifically, I'm deploying the model (running the quantizer in test mode), but some errors occurred:

[VAIQ_NOTE]: Quant config file is empty, use default quant configuration

[VAIQ_NOTE]: Quantization test process start up...

[VAIQ_NOTE]: =>Quant Module is in 'cpu'.

[VAIQ_NOTE]: =>Parsing RotatedSegmentDetector...

[VAIQ_NOTE]: Start to trace model...
/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/nn/functional.py:2359: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(input.size()[2:]))

[VAIQ_NOTE]: Finish tracing.

[VAIQ_NOTE]: Processing ops...
██████████████████████████████████████████████████| 403/403 [00:00<00:00, 2273.19it/s, OpInfo: name = return_0, type = Return]                           

[VAIQ_WARN]: The quantizer recognize new op `aten::mul_` as a float operator by default.

[VAIQ_WARN]: The quantizer recognize new op `aten::group_norm` as a float operator by default.

[VAIQ_WARN]: The quantizer recognize new op `aten::clamp_` as a float operator by default.

[VAIQ_WARN]: The quantizer recognize new op `clamp` as a float operator by default.

[VAIQ_NOTE]: =>Doing weights equalization...
/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/nndct_shared/optimization/commander.py:454: RuntimeWarning: divide by zero encountered in true_divide
  scale = np.where(sqrt_of_ranges != 0, range_0 / sqrt_of_ranges, scale)
/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/nndct_shared/optimization/commander.py:454: RuntimeWarning: invalid value encountered in true_divide
  scale = np.where(sqrt_of_ranges != 0, range_0 / sqrt_of_ranges, scale)

[VAIQ_NOTE]: =>Quantizable module is generated.(/workspaces/PheNet_Vitis-AI/quant_model/RotatedSegmentDetector.py)

[VAIQ_NOTE]: =>Get module with quantization.

[VAIQ_NOTE]: =>Converting to xmodel ...
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0328 02:17:35.101768 21213 tool_function.cpp:171] [UNILOG][WARNING] The operator named RotatedSegmentDetector__RotatedSegmentDetector_MobileNetV3_backbone__InvertedResidual_layer1__SELayer_se__ConvModule_conv2__HSigmoid_activate__20467, type: aten::clamp_, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W0328 02:17:35.176827 21213 tool_function.cpp:171] [UNILOG][WARNING] The operator named RotatedSegmentDetector__RotatedSegmentDetector_RotatedFCOSHead_bbox_head__ConvModule_cls_convs__ModuleList_0__GroupNorm_gn__input_309, type: aten::group_norm, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W0328 02:17:35.193320 21213 tool_function.cpp:171] [UNILOG][WARNING] The operator named RotatedSegmentDetector__RotatedSegmentDetector_RotatedFCOSHead_bbox_head__22828, type: cast, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W0328 02:17:35.193936 21213 tool_function.cpp:171] [UNILOG][WARNING] The operator named RotatedSegmentDetector__RotatedSegmentDetector_RotatedFCOSHead_bbox_head__22831, type: clamp, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
F0328 02:17:35.193950 21213 wrapper.cpp:111] [UNILOG][FATAL][PYXIR_INVALID_DATA_TYPE][] Unsupported data type!
*** Check failure stack trace: ***
Aborted (core dumped)
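
To see at a glance which operators need attention, the warnings in a log like the one above can be collected with a small script. This is only a sketch: the two regular expressions are derived from the message wording shown in this log, and the function name `unsupported_ops` and the dictionary keys are hypothetical, not part of Vitis-AI.

```python
import re

# Both patterns are assumptions based on the warning text in the log above;
# other Vitis-AI versions may word these messages differently.
VAIQ_PATTERN = re.compile(r"\[VAIQ_WARN\]: The quantizer recognize new op `([^`]+)`")
XIR_PATTERN = re.compile(r"type: ([\w:]+), is not defined in XIR")


def unsupported_ops(log_text):
    """Return the op types flagged by the quantizer and by XIR, deduplicated and sorted."""
    return {
        "float_by_default": sorted(set(VAIQ_PATTERN.findall(log_text))),
        "undefined_in_xir": sorted(set(XIR_PATTERN.findall(log_text))),
    }


if __name__ == "__main__":
    import sys

    # Usage: python list_ops.py quantize.log
    with open(sys.argv[1], encoding="utf-8", errors="ignore") as handle:
        for group, ops in unsupported_ops(handle.read()).items():
            print(group, ops)
```

For the log above, this would list `aten::mul_`, `aten::group_norm`, `aten::clamp_` and `clamp` as quantizer warnings, and `aten::clamp_`, `aten::group_norm`, `cast` and `clamp` as undefined in XIR.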

Here is the script I run to quantize and deploy the model:

from mmcv import collect_env
collect_env()

# Check MMRotate installation
import mmrotate
print(mmrotate.__version__)

# Check MMDetection installation
import mmdet
print(mmdet.__version__)

# Check mmcv installation
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
print(get_compiling_cuda_version())
print(get_compiler_version())

# from utils.postprocess import *
import glob
import os.path as osp
import shutil
import os
from tqdm.notebook import tqdm

import mmcv
from mmcv.runner import load_checkpoint

from mmdet.apis import inference_detector, show_result_pyplot
from mmrotate.models import build_detector
import torch
import numpy as np

from pytorch_nndct.apis import torch_quantizer, dump_xmodel
from icecream import ic
import sys
import argparse
import cv2
from utils.preprocess import preprocess
from utils.data import load_data
from tqdm import tqdm  # overrides the tqdm.notebook import above for console use

from pytorch_nndct.apis import Inspector

def parse_args():
    parser = argparse.ArgumentParser(description='Testing config for the Implementation')
    parser.add_argument('-q',  '--quant_mode', type=str, default='test',
                        choices=['calib','test'], help='Quantization mode (calib or test). Default is test')
    parser.add_argument('-cf',  '--model_config', type=str, default='')
    parser.add_argument('-c',  '--checkpoint', type=str, default='')
    parser.add_argument('-o',  '--out_model_dir', type=str, default='')
    parser.add_argument('-n',  '--num_data', type=int, default=100)
    args = parser.parse_args()
    return args

def calculate_size(model):
    size = 0
    sl = 0
    for name , param in model.named_parameters():
        ic(name, param.dtype)
        size += sys.getsizeof(param.storage())/1024**2
        sl += param.numel()

    print(f"model size : {size:.3f} MB")    
    print(f"sl param : {sl} ")

if __name__ == '__main__':
    args = parse_args()
    output_dir = args.out_model_dir
    mode = args.quant_mode
    num_data = args.num_data

    ### Loading model
    # Choose to use a config and initialize the detector
    config = args.model_config

    # Setup a checkpoint file to load
    checkpoint = args.checkpoint

    # Set the device to be used for evaluation
    # device='cuda:0'
    device='cpu'

    # Load the config
    config = mmcv.Config.fromfile(config)
    # Set pretrained to be None since we do not need pretrained model here
    config.model.pretrained = None

    # Initialize the detector
    model = build_detector(config.model)

    # Load checkpoint
    checkpoint = load_checkpoint(model, checkpoint, map_location=device)

    # Set the classes of models for inference
    model.CLASSES = checkpoint['meta']['CLASSES']

    # We need to set the model's cfg for inference
    model.cfg = config

    # Move the model to the selected device
    model.to(device)
    # Convert the model into evaluation mode
    model.eval()

    # Specify a target name or fingerprint you want to deploy on
    # target = "DPUCAHX8L_ISA0_SP"
    # # Initialize inspector with target
    # inspector = Inspector(target)

    ### Export model
    model.forward = model.forward_dummy
    images = glob.glob \
    ('/workspaces/PheNet_Vitis-AI/data/slot_data_256_with_angle/valid/images/*.png')[:num_data]
    dummy_input = torch.randn([1, 3, 256, 256])

    # inspector.inspect(model, (dummy_input,), device=torch.device(device), output_dir="inspect", image_format="png") 

    with torch.no_grad():
        quantizer = torch_quantizer(mode, model, (dummy_input,), output_dir=output_dir, device=torch.device("cpu"))

    quantized_model = quantizer.quant_model
    quantized_model.eval()

    # val_loader, _ = load_data(
    #     images,
    #     subset_len=num_data,
    #     batch_size=64,
    #     sample_method='random',
    # )

    # for iteration, images in tqdm(
    #   enumerate(val_loader), total=len(val_loader)):
    #     output = quantized_model(images)

    output = quantized_model(dummy_input)

    # for path in tqdm(images):
    #     input_tensor = file2tensor(path)
    #     output = quantized_model(input_tensor)

    if mode == 'calib':
        quantizer.export_quant_config()
    else:
        quantizer.export_xmodel(deploy_check=False, output_dir=output_dir)

with the command:

python test_code/inference_pth.py -q test \
-cf /workspaces/PheNet_Vitis-AI/models/multihead/rotated_fcos_kld_mbnv3_fpn_1x_dota_le90.py \
-c /workspaces/PheNet_Vitis-AI/models/multihead/multihead.pth \
-o /workspaces/PheNet_Vitis-AI/quant_model

I'm using Vitis-AI 2.5 for stability. I read the docs on registering custom operations, but I don't know how to apply that workflow to register these custom ATen ops. Could someone show me how? Many thanks.

dungng27 commented 1 year ago

It turns out that I can modify these ops by using torch functions rather than tensor's functions and register these. The problem is solved!

dungng27 commented 1 year ago

It turns out that I can modify these ops by using torch functions rather than tensor's functions and register these. The problem is solved!

This is just a temporary fix and does not apply to other ops. I wonder if there is a better workaround.

manudwd commented 1 year ago

@dungng27

It turns out that I can modify these ops by using torch functions rather than tensor's functions and register these. The problem is solved!

Can you elaborate a little more on this with a dummy example? I'd really appreciate it.
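
A minimal sketch of the kind of rewrite described above, under stated assumptions: the log shows `aten::clamp_` coming from an `HSigmoid` activation, so the example swaps an in-place tensor method for the equivalent `torch` function. The class names and the constants 3.0 and 6.0 are hypothetical and chosen for illustration; they are not taken from the original poster's model. The registration step is not shown because it needs the Vitis-AI environment; the Vitis-AI docs on registering custom operations describe a `register_custom_op` decorator for that part.

```python
import torch
from torch import nn


class HSigmoidInplace(nn.Module):
    """Hard sigmoid written with an in-place tensor method (x.clamp_)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = (x + 3.0) / 6.0
        # In-place method on the tensor; ops like this appear as aten::clamp_ in the log.
        return x.clamp_(0.0, 1.0)


class HSigmoidFunctional(nn.Module):
    """Same computation written with the out-of-place torch function."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = (x + 3.0) / 6.0
        # torch.clamp returns a new tensor instead of modifying x in place.
        return torch.clamp(x, min=0.0, max=1.0)


if __name__ == "__main__":
    sample = torch.linspace(-6.0, 6.0, steps=13)
    same = torch.allclose(HSigmoidInplace()(sample.clone()),
                          HSigmoidFunctional()(sample.clone()))
    print("outputs match:", same)
```

Both modules compute the same values, so the swap should not change the float model's output. Whether the quantizer then accepts the op still depends on the Vitis-AI version in use.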

shaantamchawla commented 1 year ago

I'm also stumped on this, with the specific operators:

aten::meshgrid
aten::split_with_sizes

acg93-pixel commented 1 year ago

Did you manage to add validation for the quantized model from MMEngine? Could you share it, please?

SM2XWe1L commented 2 months ago

I'm also stumped on this, with the specific operators:

aten::meshgrid
aten::split_with_sizes

Hello, I've run into the same problem. Have you solved it? Looking forward to your reply!

acg93-pixel commented 2 months ago

Hi, you have to replace these operations in your model definition with supported ones.

SM2XWe1L commented 2 months ago

Hi, you have to replace these operations in your model definition with supported ones.

Thanks for your kind reply! I still have a problem, though: I found meshgrid in the make_anchors function, but I can't find where split_with_sizes is. Could you tell me where aten::split_with_sizes is used?

Thanks.

acg93-pixel commented 2 months ago

Well, I'm not sure which model and repository version you are using. In PyCharm, perhaps try the "Find in Files" option with just "split_with_sizes". In my case, I used RTMDet-Ins in MMDetection, but "split_with_sizes" occurred in post-processing, which the quantizer doesn't put in the model anyway. So you'd have to implement everything after the model's forward function yourself.
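
For anyone without an IDE, the same search can be sketched in a few lines of Python. One caveat: the literal string "split_with_sizes" may not appear in the model source at all, because in PyTorch a call to `torch.split` (or `Tensor.split`) with a list of sizes is dispatched to `split_with_sizes`. The default patterns below are therefore guesses at likely spellings, and the function name `find_op_usages` is hypothetical.

```python
import re
from pathlib import Path

# Assumed source spellings that may end up as aten::split_with_sizes or
# aten::meshgrid after tracing; extend the tuple for other ops.
DEFAULT_PATTERNS = (r"split_with_sizes", r"\.split\(", r"meshgrid")


def find_op_usages(root, patterns=DEFAULT_PATTERNS):
    """Return (file, line number, stripped line) for each matching line under root."""
    regex = re.compile("|".join(patterns))
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # skip unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_ops.py path/to/model/source
    for file_name, lineno, line in find_op_usages(sys.argv[1]):
        print(f"{file_name}:{lineno}: {line}")
```

The `\.split\(` pattern also matches ordinary string splits, so expect some false positives to filter by hand.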

SM2XWe1L commented 2 months ago

Well, I'm not sure which model and repository version you are using. In PyCharm, perhaps try the "Find in Files" option with just "split_with_sizes". In my case, I used RTMDet-Ins in MMDetection, but "split_with_sizes" occurred in post-processing, which the quantizer doesn't put in the model anyway. So you'd have to implement everything after the model's forward function yourself.

Thanks a lot! Actually, I am using Vitis-AI to quantize a YOLOv8 model; I will try to find the op in post-processing.