openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0
7.25k stars 2.26k forks

PTOT quantization #10617

Closed YeziXuannao closed 2 years ago

YeziXuannao commented 2 years ago

System information (version)

Detailed description

Hello there. I have an OpenVINO model that recognizes person bodies, and it works quite well. Now I want to quantize the model to INT8, but I've run into a problem: the INT8 model can't find any person bodies (the height and width of the bounding box are 0). I tried both the command-line interface and the POT API, and neither worked. I searched extensively but came up empty. Could anyone help me?

Steps to reproduce

POT API code
import os
import copy
import numpy as np
import cv2
from addict import Dict

from compression.api import DataLoader, Metric
from compression.engines.ie_engine import IEEngine
from compression.graph import load_model, save_model
from compression.graph.model_utils import compress_model_weights
from compression.pipeline.initializer import create_pipeline
from openvino.inference_engine import IECore
from pathlib import Path

class ClassificationDataLoader(DataLoader):
    """
    DataLoader for image data that is stored in a directory per category. For example, for
    categories _rose_ and _daisy_, rose images are expected in data_source/rose, daisy images
    in data_source/daisy.
    """

    def __init__(self, data_source):
        """
        :param data_source: path to data directory
        """
        self.dataset = [os.path.join(data_source, x) for x in os.listdir(data_source)]

    def __len__(self):
        """
        Returns the number of elements in the dataset
        """
        return len(self.dataset)

    def __getitem__(self, index):
        """
        Get item from self.dataset at the specified index.
        Returns (annotation, image), where annotation is a tuple (index, class_index)
        and image a preprocessed image in network shape
        """
        if index >= len(self):
            raise IndexError
        filepath = self.dataset[index]
        print(filepath)
        image = self._read_image(filepath, (640, 640), (114, 114, 114))
        return (index, 0), image

    def _read_image(self, origin_image_path, target_size, padding_rgb_value):
        """
        Resize image and then add the border(keep the rate of height/width)
        :param origin_image_path: String, origin image path
        :param target_size: Int in list, the first element is the target height, the second element is the target width
        :param padding_rgb_value: Int in list, RGB value, used in adding border, like (114, 114, 114)
        return: Array, the shape is (3, target_size(0), target_size(1))
        """
        # Read image
        origin_image = cv2.imread(origin_image_path)
        origin_shape = origin_image.shape

        # Height and width
        origin_height, origin_width = origin_shape[0], origin_shape[1]
        target_height, target_width = target_size

        # ################## Resize ############################
        # Resize rate
        rate = min(target_height / origin_height, target_width / origin_width)
        # Resize shape
        new_shape = int(origin_height * rate), int(origin_width * rate)
        # Resize
        new_image = cv2.resize(origin_image, (new_shape[1], new_shape[0]), interpolation=cv2.INTER_LINEAR)
        # ######################################################

        # ################### Padding ###########################################
        dh, dw = target_height - new_shape[0], target_width - new_shape[1]
        dh /= 2
        dw /= 2
        # The plus/minus 0.1 ensures the padded image has exactly the target shape, neither larger nor smaller
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        # add border
        new_image = cv2.copyMakeBorder(new_image, top, bottom, left, right, cv2.BORDER_CONSTANT,
                                       value=padding_rgb_value)
        # BGR to RGB, HWC TO CHW
        new_image = new_image.transpose(2, 0, 1)[::-1]

        # To array
        new_image = np.array(new_image).astype(np.float32)
        # Convert to contiguous array
        new_image = np.ascontiguousarray(new_image)
        # Normalization (Generally speaking this step is necessary)
        new_image /= 255

        # new_image = new_image[np.newaxis, :, :, :]

        return new_image

model_config = Dict(
    {
        "model_name": "yolov5",
        "model": "/ye/yolov5_old/last_test.xml",
        "weights": "/ye/yolov5_old/last_test.bin",
    }
)

engine_config = Dict({"device": "CPU", "stat_requests_number": 2, "eval_requests_number": 2})

algorithms = [
    {
        "name": "DefaultQuantization",
        "params": {
            "target_device": "CPU",
            "preset": "performance",
            "stat_subset_size": 500,
        },
    }
]

if __name__ == '__main__':
    # Step 1: Load the model
    model = load_model(model_config=model_config)
    original_model = copy.deepcopy(model)
    print('step 1 done')

    # Step 2: Initialize the data loader
    data_loader = ClassificationDataLoader(data_source='/ye/datasets/coco_person_500')
    print('step 2 done')

    # Step 3 (Optional. Required for AccuracyAwareQuantization): Initialize the metric
    #        Compute metric results on original model
    # metric = Accuracy()

    # Step 4: Initialize the engine for metric calculation and statistics collection
    engine = IEEngine(config=engine_config, data_loader=data_loader)
    print('step 4 done')

    # Step 5: Create a pipeline of compression algorithms
    pipeline = create_pipeline(algo_config=algorithms, engine=engine)
    print('step 5 done')

    # Step 6: Execute the pipeline
    compressed_model = pipeline.run(model=model)
    print('step 6 done')

    # Step 7 (Optional): Compress model weights quantized precision
    #                    in order to reduce the size of final .bin file
    compress_model_weights(model=compressed_model)
    print('step 7 done')

    # Step 8: Save the compressed model and get the path to the model
    compressed_model_paths = save_model(
        model=compressed_model, save_path=os.path.join(os.path.curdir, "model/optimized")
    )
    compressed_model_xml = Path(compressed_model_paths[0]["model"])
    print(f"The quantized model is stored in {compressed_model_xml}")

    # # Step 9 (Optional): Evaluate the original and compressed model. Print the results
    # original_metric_results = pipeline.evaluate(original_model)
    # if original_metric_results:
    #     print(f"Accuracy of the original model:  {next(iter(original_metric_results.values())):.5f}")
    #
    # quantized_metric_results = pipeline.evaluate(compressed_model)
    # if quantized_metric_results:
    #     print(f"Accuracy of the quantized model: {next(iter(quantized_metric_results.values())):.5f}")
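The resize-and-pad arithmetic in `_read_image` above can be checked in isolation, without OpenCV. The following is a minimal sketch that only mirrors the size and padding computation from the posted code; the function name `letterbox_params` is hypothetical and is not part of POT or YOLOv5.

```python
def letterbox_params(origin_height, origin_width, target_height, target_width):
    """Return ((new_h, new_w), (top, bottom, left, right)) for an
    aspect-preserving resize followed by constant padding."""
    # Scale factor that fits the image inside the target without distortion
    rate = min(target_height / origin_height, target_width / origin_width)
    new_height, new_width = int(origin_height * rate), int(origin_width * rate)

    # Half of the leftover space goes on each side
    dh = (target_height - new_height) / 2
    dw = (target_width - new_width) / 2

    # The -0.1 / +0.1 offsets split an odd remainder as (n, n + 1)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    return (new_height, new_width), (top, bottom, left, right)


if __name__ == "__main__":
    # A 427x640 (height x width) image padded to 640x640
    print(letterbox_params(427, 640, 640, 640))  # ((427, 640), (106, 107, 0, 0))
```

Running the same function over the calibration images is a cheap way to confirm that every sample really ends up at the network input size before it reaches the engine.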

Command line

  1. I manually resize and pad the images, then save them.
  2. Use the command pot -c /ye/yolov5s.json

    yolov5s.json

    {
    "model": {
        "model_name": "yolov5s-pytorch",
        "model": "/ye/yolov5_old/last_test.xml",
        "weights": "/ye/yolov5_old/last_test.bin"
    },
    "engine": {
        "config": "/ye/configuration.yaml"
    },
    "compression": {
        "algorithms": [
            {
                "name": "DefaultQuantization",
                "params": {
                    "preset": "mixed",          
                    "stat_subset_size": 100
                }
            }
        ]
    }
    }

    configuration.yaml

    models:
    - name: last_test
    launchers:
      - framework: dlsdk
        batch: 1
        device: CPU
        adapter: classification
    
    datasets:
      - name: coco_person
        data_source: /ye/datasets/coco_person_500
        # annotation_conversion:
        #   converter: imagenet
        #   annotation_file: ./ImageNet/val.txt
        reader: pillow_imread
    
        preprocessing:
           - type: normalization
             std: 255

Iffa-Intel commented 2 years ago

Hi @YeziXuannao,

Please provide the following:

  1. Your YOLOv5 source
  2. The MO conversion command (IR)
  3. The Quantization command (POT/int8)
  4. The inferencing command
  5. Model files (IR & int8)

YeziXuannao commented 2 years ago

Hi @Iffa-Meah, I've solved the problem. Thanks for your attention!

BackT0TheFuture commented 2 years ago

@YeziXuannao Hi, I ran into the same problem: YOLOv5 FP32 works well, but it detects nothing after INT8 quantization. Could you tell me what the solution is? Thanks.

YeziXuannao commented 2 years ago

Hi @goodtogood, my problem was that I converted the model myself instead of using export.py from the yolov5 project. The PyTorch compatibility handling did not work correctly: it did not recognize the Detect and Model classes, so the value of the inplace attribute was incorrect. You can find the compatibility code around line 100 of yolov5/models/experimental.py. Hope that helps!
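The failure mode described above (a type check that does not recognize Detect and Model, leaving inplace unset) can be illustrated without PyTorch. This is a minimal sketch using hypothetical stand-in classes, not the actual code from yolov5/models/experimental.py. It assumes the compatibility step matches modules by class identity, so a class that merely shares the name is silently skipped.

```python
class Detect:
    """Stand-in for the Detect class the compatibility step knows about (hypothetical)."""


def make_foreign_detect():
    """Return an instance of a different class that only shares the name 'Detect'."""
    class Detect:  # same name, but a distinct class object
        pass
    return Detect()


def apply_inplace_compat(modules, inplace, known_types):
    """Set `inplace` on every module whose exact type is in known_types.

    Returns the number of modules that were updated.
    """
    updated = 0
    for m in modules:
        if type(m) in known_types:  # identity check on the class, not on its name
            m.inplace = inplace
            updated += 1
    return updated


if __name__ == "__main__":
    recognized, foreign = Detect(), make_foreign_detect()
    count = apply_inplace_compat([recognized, foreign], inplace=True, known_types=(Detect,))
    print(count, hasattr(recognized, "inplace"), hasattr(foreign, "inplace"))  # 1 True False
```

If a custom conversion script loads the checkpoint so that its classes differ from the ones the check expects, the attribute is never set and no error is raised, which may be why the problem only shows up later in the pipeline.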

BackT0TheFuture commented 2 years ago

@YeziXuannao Thanks for the information! I will take a look. My problem is a little different from yours: the scores are very low after INT8 quantization, but not zero. I used DefaultQuantization. YOLOX, however, works well after being quantized the same way. So far I have not found what the problem is.

sajeevrajput commented 2 years ago

Hi @YeziXuannao, I've run into the same quantization issue, where the detection area is 0 (the height and width of the bounding box are 0). In my case, I first converted the yolov5s.pt model to an .onnx file using the torch APIs as documented here. This gives me an ONNX model that I then convert to FP32 OpenVINO IR using Model Optimizer. Up to this point, the detections are fine. Once I quantize the IR, I don't get any detections at all.

Did you follow the same approach as I did? At what point did you have to use export.py to solve your issue? As far as I know, export.py can produce both ONNX and OpenVINO IR as output.
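One way to narrow down the stage at which detections disappear is to feed the same preprocessed input to the FP32 and INT8 models and count degenerate boxes in the raw output. The sketch below assumes each output row starts with [cx, cy, w, h, confidence], which is an assumption about the exported model's layout; `degenerate_box_fraction` is a hypothetical helper, not a POT or OpenVINO function, and the sample rows are made-up values for illustration only.

```python
def degenerate_box_fraction(rows, conf_threshold=0.25, eps=1e-6):
    """Fraction of confident rows whose width or height is (near) zero.

    Each row is assumed to start with [cx, cy, w, h, confidence].
    Returns None when no row passes the confidence threshold.
    """
    confident = [r for r in rows if r[4] >= conf_threshold]
    if not confident:
        return None
    degenerate = [r for r in confident if r[2] <= eps or r[3] <= eps]
    return len(degenerate) / len(confident)


if __name__ == "__main__":
    # Made-up rows standing in for raw model outputs
    fp32_rows = [[320, 320, 100, 200, 0.9], [10, 10, 5, 5, 0.1]]
    int8_rows = [[320, 320, 0, 0, 0.9], [100, 100, 0, 0, 0.8]]
    print(degenerate_box_fraction(fp32_rows))  # 0.0
    print(degenerate_box_fraction(int8_rows))  # 1.0
```

A fraction near 1.0 for INT8 but near 0.0 for FP32 on identical input would point at the quantized graph rather than at preprocessing or postprocessing.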

JFL888 commented 2 years ago

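Hi @Iffa-Meah, I've solved the problem. Thanks for your attention!

Hi there! How did you solve the problem with the OpenVINO model after converting it to INT8? After I convert to INT8, the mAP is 0.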