int8 and fp16 accuracy

blue-q commented 1 year ago

Checklist

[ ] 1. I have searched related issues but cannot get the expected help.
[ ] 2. I have read the FAQ documentation but cannot get the expected help.
[ ] 3. The bug has not been fixed in the latest version.

Describe the bug

I trained a model, and its accuracy is fine when I test with the pth checkpoint. After converting it to fp16 or int8, however, I tested several videos from different categories and every one of them gets the same output category, with a score below 1. The pth model predicts the same videos with a relatively large score. What is the reason?

Reproduction

python tools/deploy.py configs/mmaction/video-recognition/video-recognition_2d_tensorrt_static-224x224.py tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_kinetics400-rgb.py best_acc_top1_epoch_24.pth tests/data/arm_wrestling.mp4 --work-dir mmdeploy_models/mmaction/tsn_fp16_test/ort --device cuda --show --dump-info

Environment

06/05 10:20:07 - mmengine - INFO - **********Backend information**********
06/05 10:20:07 - mmengine - INFO - tensorrt:    7.2.3.4
06/05 10:20:07 - mmengine - INFO - tensorrt custom ops: Available
06/05 10:20:07 - mmengine - INFO - ONNXRuntime: None
06/05 10:20:07 - mmengine - INFO - pplnn:       None
06/05 10:20:07 - mmengine - INFO - ncnn:        None
06/05 10:20:07 - mmengine - INFO - snpe:        None
06/05 10:20:07 - mmengine - INFO - openvino:    None
06/05 10:20:07 - mmengine - INFO - torchscript: 1.12.1
06/05 10:20:07 - mmengine - INFO - torchscript custom ops:      NotAvailable
06/05 10:20:07 - mmengine - INFO - rknn-toolkit:        None
06/05 10:20:07 - mmengine - INFO - rknn-toolkit2:       None
06/05 10:20:07 - mmengine - INFO - ascend:      None
06/05 10:20:07 - mmengine - INFO - coreml:      None
06/05 10:20:07 - mmengine - INFO - tvm: None
06/05 10:20:07 - mmengine - INFO - vacc:        None
06/05 10:20:07 - mmengine - INFO -

Error traceback

No response

irexyc commented 1 year ago

Hi @blue-q, could you provide the test script and the video for which the pth and TensorRT results differ?

blue-q commented 1 year ago

I found the reason: mmdeploy.so uses cv::Mat everywhere, but I was passing a cv::cuda::GpuMat, which caused the problem. Does mmdeploy currently not support GpuMat? For the TSN video classification task, if I wanted to change the relevant cv::Mat usage in mmdeploy to GpuMat, how much work would that be and what changes would I need to make?

irexyc commented 1 year ago

Hi @blue-q,

You can use GpuMat, but make sure the data is continuous. You also have to manage the lifetime of gpumat.data yourself.

I made some changes to demo/csrc/c/image_classification.cpp that you can refer to:

#include <cstdio>  // fprintf
#include <fstream>
#include <opencv2/core/cuda.hpp>
#include <opencv2/imgcodecs/imgcodecs.hpp>
#include <string>

#include "mmdeploy/classifier.h"

int main(int argc, char* argv[]) {
  if (argc != 4) {
    fprintf(stderr, "usage:\n  image_classification device_name dump_model_directory image_path\n");
    return 1;
  }
  auto device_name = argv[1];
  auto model_path = argv[2];
  auto image_path = argv[3];
  cv::Mat img = cv::imread(image_path);
  if (!img.data) {
    fprintf(stderr, "failed to load image: %s\n", image_path);
    return 1;
  }

  mmdeploy_classifier_t classifier{};
  int status{};
  status = mmdeploy_classifier_create_by_path(model_path, device_name, 0, &classifier);
  if (status != MMDEPLOY_SUCCESS) {
    fprintf(stderr, "failed to create classifier, code: %d\n", (int)status);
    return 1;
  }

  // mmdeploy_mat_t mat{
  //     img.data, img.rows, img.cols, 3, MMDEPLOY_PIXEL_FORMAT_BGR, MMDEPLOY_DATA_TYPE_UINT8};

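  // Allocate a continuous GpuMat: mmdeploy_mat_t expects densely packed data.
  cv::cuda::GpuMat gpumat;
  cv::cuda::createContinuous(img.rows, img.cols, img.type(), gpumat);

  // Copy the host image to the GPU. gpumat must stay alive until apply() returns.
  gpumat.upload(img);
  // printf("gpumat.isContinuous() %d\n", (int)gpumat.isContinuous());

  // Pass a CUDA device handle so mmdeploy treats mat.data as device memory.
  mmdeploy_device_t device;
  mmdeploy_device_create("cuda", 0, &device);
  mmdeploy_mat_t mat{
      gpumat.data, img.rows, img.cols, 3, MMDEPLOY_PIXEL_FORMAT_BGR, MMDEPLOY_DATA_TYPE_UINT8,
      device};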

  mmdeploy_classification_t* res{};
  int* res_count{};
  status = mmdeploy_classifier_apply(classifier, &mat, 1, &res, &res_count);
  if (status != MMDEPLOY_SUCCESS) {
    fprintf(stderr, "failed to apply classifier, code: %d\n", (int)status);
    return 1;
  }
  for (int i = 0; i < res_count[0]; ++i) {
    fprintf(stderr, "label: %d, score: %.4f\n", res[i].label_id, res[i].score);
  }

  mmdeploy_classifier_release_result(res, res_count, 1);

  mmdeploy_classifier_destroy(classifier);

  mmdeploy_device_destroy(device);

  return 0;
}

irexyc commented 1 year ago

@blue-q

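You can compare the code I modified above with the code in the repository. The main difference is that creating a mmdeploy_mat_t takes a device parameter, and your code does not set it.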

mmdeploy_mat_t mat{
      gpumat.data, img.rows, img.cols, 3, MMDEPLOY_PIXEL_FORMAT_BGR, MMDEPLOY_DATA_TYPE_UINT8,
      device};

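In addition, the data memory in mmdeploy_mat_t must be continuous. Your code does not show how _image is obtained; you can use isContinuous() to check whether it is continuous. If it is not, you may need createContinuous and copyTo to make it so.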
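
To illustrate why continuity matters, here is a minimal, library-free sketch. It assumes a row-pitched buffer, similar to what a GpuMat may hold when each row is padded to `step` bytes, and packs it row by row into a dense buffer, which is roughly what createContinuous plus copyTo achieve on the GPU. `PitchedImage`, `is_continuous`, and `to_continuous` are hypothetical names used only for illustration; they are not OpenCV or mmdeploy APIs, and host memory stands in for device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a pitched image: each row occupies `step` bytes,
// of which only the first cols * channels bytes are pixel data.
struct PitchedImage {
  int rows;
  int cols;
  int channels;
  std::size_t step;                // bytes per row, including padding
  std::vector<std::uint8_t> data;  // rows * step bytes

  std::size_t row_bytes() const { return static_cast<std::size_t>(cols) * channels; }
};

// Same idea as isContinuous(): there is no padding between rows.
bool is_continuous(const PitchedImage& img) {
  return img.rows == 1 || img.step == img.row_bytes();
}

// Copy row by row into a densely packed buffer, dropping the padding.
std::vector<std::uint8_t> to_continuous(const PitchedImage& img) {
  assert(img.step >= img.row_bytes());
  assert(img.data.size() >= img.step * static_cast<std::size_t>(img.rows));
  std::vector<std::uint8_t> packed(img.row_bytes() * static_cast<std::size_t>(img.rows));
  for (int r = 0; r < img.rows; ++r) {
    std::memcpy(packed.data() + r * img.row_bytes(),
                img.data.data() + r * img.step,
                img.row_bytes());
  }
  return packed;
}
```

If a consumer reads a padded buffer as `rows * cols * channels` consecutive bytes, it picks up padding bytes as pixels from the second row onward, which would distort the model input without raising any error.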

Also, creating and destroying a GpuMat is fairly expensive; you can consider using a GPU memory pool to avoid this cost.
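
The pooling idea can be sketched as follows, under loud assumptions: `BufferPool` here is a hypothetical class written for this example, and it uses host `std::vector` buffers as a stand-in for device memory. A real version would hold GPU allocations (for example GpuMat objects) instead. The point is only the reuse pattern: released buffers are kept and handed out again for requests of the same size, so steady-state frames trigger no new allocation.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical size-keyed pool; host memory stands in for device memory.
class BufferPool {
 public:
  using Buffer = std::vector<std::uint8_t>;

  // Return a buffer of exactly `size` bytes, reusing a released one if available.
  Buffer acquire(std::size_t size) {
    auto it = free_.find(size);
    if (it != free_.end() && !it->second.empty()) {
      Buffer buf = std::move(it->second.back());
      it->second.pop_back();
      ++reused_;
      return buf;
    }
    ++allocated_;
    return Buffer(size);
  }

  // Hand a buffer back so a later acquire() of the same size can reuse it.
  void release(Buffer buf) {
    const std::size_t size = buf.size();
    free_[size].push_back(std::move(buf));
  }

  std::size_t allocated() const { return allocated_; }
  std::size_t reused() const { return reused_; }

 private:
  std::map<std::size_t, std::vector<Buffer>> free_;  // released buffers by size
  std::size_t allocated_ = 0;                        // fresh allocations so far
  std::size_t reused_ = 0;                           // requests served from the pool
};
```

For fixed-size input such as 224x224 BGR frames, every frame after the first would be served from the pool as long as each buffer is released after inference.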
