open-mmlab / mmdeploy

OpenMMLab Model Deployment Framework
https://mmdeploy.readthedocs.io/en/latest/
Apache License 2.0

scripts to validate whether the performance of the deployed model matches that of its PyTorch model #216

Closed lvhan028 closed 2 years ago

lvhan028 commented 2 years ago

Motivation

As a part of the regression test, the performance of a deployed model should be checked for consistency with the performance of its PyTorch model. In other words, MMDeploy should evaluate the metrics of a deployed model and check whether they match the metrics reported by the PyTorch model.

Requirement

python scripts.py <args>

TODO: discuss the proper args

PeterH0323 commented 2 years ago

Hi lvhan028, here is my idea. 😄

Prerequisite

Before using this script, the user has to install the backend tools that he/she wants to use, such as TensorRT, ONNX and so on.

Usage

python ./tools/accuracy_test.py \
    --deploy-cfg "${DEPLOY_CFG_PATH}" \
    --model-cfg "${MODEL_CFG_PATH}" \
    --checkpoint "${MODEL_CHECKPOINT_PATH}" \
    --dataset-path "${TEST_DATASET_PATH}" \
    --work-dir "${WORK_DIR}" \
    --calib-dataset-cfg "${CALIB_DATA_CFG}" \
    --device "${DEVICE}" \
    --log-level INFO \
    --show \
    --dump-info

Description of all arguments:

  • --deploy-cfg: The config for deployment.
  • --model-cfg: The config of the model in the OpenMMLab codebase.
  • --checkpoint: The path of the model checkpoint file.
  • --dataset-path: The dataset used by the test pipeline.
  • --work-dir: The path of the work directory used to save logs and models.
  • --calib-dataset-cfg: The config used for calibration. If not specified, it will be set to None.
  • --device: The device used for conversion. If not specified, it will be set to cpu.
  • --log-level: The log level, one of 'CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'. If not specified, it will be set to INFO.
  • --show: Whether to show detection outputs.
  • --dump-info: Whether to output information for the SDK.
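For illustration, a minimal argparse skeleton for the proposed tools/accuracy_test.py could look like the sketch below. The argument names simply mirror the list above; nothing here is an existing MMDeploy API.

```python
# Hypothetical CLI skeleton for the proposed tools/accuracy_test.py.
# Argument names mirror the proposal above; this is a sketch, not MMDeploy code.
import argparse


def parse_args():
    parser = argparse.ArgumentParser(
        description='Check a deployed model against its PyTorch baseline.')
    parser.add_argument('--deploy-cfg', required=True, help='deploy config path')
    parser.add_argument('--model-cfg', required=True,
                        help='model config of the OpenMMLab codebase')
    parser.add_argument('--checkpoint', required=True, help='PyTorch checkpoint path')
    parser.add_argument('--dataset-path', help='dataset used by the test pipeline')
    parser.add_argument('--work-dir', help='directory used to save logs and models')
    parser.add_argument('--calib-dataset-cfg', default=None,
                        help='config used for calibration')
    parser.add_argument('--device', default='cpu', help='device used for conversion')
    parser.add_argument('--log-level', default='INFO',
                        choices=['CRITICAL', 'FATAL', 'ERROR', 'WARN', 'WARNING',
                                 'INFO', 'DEBUG', 'NOTSET'])
    parser.add_argument('--show', action='store_true', help='show detection outputs')
    parser.add_argument('--dump-info', action='store_true',
                        help='output information for the SDK')
    return parser.parse_args()


if __name__ == '__main__':
    print(parse_args())
```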

Example

python ./tools/accuracy_test.py \
    --deploy-cfg "configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py" \
    --model-cfg "$PATH_TO_MMDET/configs/yolo/yolov3_d53_mstrain-608_273e_coco.py" \
    --checkpoint "$PATH_TO_MMDET/checkpoints/yolo/yolov3_d53_mstrain-608_273e_coco.pth" \
    --dataset-path "${PATH_TO_MMDET}/data/coco/val2017" \
    --work-dir "${WORK_DIR}" \
    --show \
    --device cuda:0

Note

The script will complete the following steps:

  1. Use the model_cfg to load the test_pipeline and evaluate the dataset. This gives the baseline of the PyTorch model, which I name baseline A.
  2. The script converts the model according to the deploy_cfg; taking TensorRT FP16 as an example, we get A_trt_fp16.engine after this step.
  3. As in step 1, run the test with the test_pipeline on the converted model to get baseline B.
  4. Print the comparison of baseline A and baseline B to the terminal, like the table below (a rough sketch of this flow follows the table):

| model_cfg | hmean | precision | recall | FPS |
| --- | --- | --- | --- | --- |
| yolov3_d53_mstrain-608_273e_coco.py | 0.95 | 0.98 | 0.88 | 30 |
| detection_tensorrt_dynamic-320x320-1344x1344.py | 0.95 | 0.97 | 0.87 | 300 |
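A rough Python sketch of those four steps, assuming two placeholder helpers (evaluate_with_pipeline and convert_model) that are not existing MMDeploy functions:

```python
# Sketch of the proposed four-step flow. evaluate_with_pipeline() and convert_model()
# are placeholder stubs; in a real script they would call the codebase's test pipeline
# and MMDeploy's model conversion, respectively.
def evaluate_with_pipeline(cfg, weights, dataset_path, device):
    """Placeholder: run the test_pipeline defined by `cfg` on the dataset, return metrics."""
    return {'hmean': 0.95, 'precision': 0.98, 'recall': 0.88, 'fps': 30.0}


def convert_model(deploy_cfg, model_cfg, checkpoint, device):
    """Placeholder: convert the checkpoint according to deploy_cfg, return backend files."""
    return ['work_dir/A_trt_fp16.engine']


def run_accuracy_test(model_cfg, deploy_cfg, checkpoint, dataset_path, device='cuda:0'):
    baseline_a = evaluate_with_pipeline(model_cfg, checkpoint, dataset_path, device)     # step 1
    backend_files = convert_model(deploy_cfg, model_cfg, checkpoint, device)             # step 2
    baseline_b = evaluate_with_pipeline(model_cfg, backend_files, dataset_path, device)  # step 3
    for name, m in ((model_cfg, baseline_a), (deploy_cfg, baseline_b)):                  # step 4
        print(f"{name}  hmean={m['hmean']:.2f}  precision={m['precision']:.2f}  "
              f"recall={m['recall']:.2f}  FPS={m['fps']:.0f}")
```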
lvhan028 commented 2 years ago

...

I think it's better to do the job on a batch of models and a batch of backends, not just one. Meanwhile, compute precisions such as FP32, FP16 and INT8 should be taken into consideration, too. Regarding the final report, it is necessary to show the matching result (Yes or No) besides the performance numbers.
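One way the matching result could be computed is a simple tolerance check like the sketch below; the 0.01 tolerance and the metric names are placeholders, not decided values.

```python
# Sketch: decide the Yes/No "match" verdict by comparing the deployed model's metrics
# against the PyTorch baseline with a tolerance. The tolerance here is a placeholder.
def metrics_match(pytorch_metrics, backend_metrics,
                  keys=('hmean', 'precision', 'recall'), tolerance=0.01):
    """Return True if every metric is no worse than the baseline minus `tolerance`."""
    return all(backend_metrics[k] >= pytorch_metrics[k] - tolerance for k in keys)


pytorch = {'hmean': 0.95, 'precision': 0.98, 'recall': 0.88}
trt_fp16 = {'hmean': 0.95, 'precision': 0.97, 'recall': 0.87}
print('Yes' if metrics_match(pytorch, trt_fp16) else 'No')  # prints: Yes
```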

PeterH0323 commented 2 years ago

...

I think it's better to do the job on a batch of models and a batch of backends, not just one. Meanwhile, compute precisions such as FP32, FP16 and INT8 should be taken into consideration, too. Regarding the final report, it is necessary to show the matching result (Yes or No) besides the performance numbers.

Hi lvhan028, I have some questions about your reply: 😄

lvhan028 commented 2 years ago

...

  • I think it's better to do the job on a batch of models and a batch of backends, not just one.: In my case, I use PyTorch to pick a specific model and train it on my own dataset, so at that point I already know which model is best for me. After training, I want to test the precision and other scores of the model after converting it to a deployable backend. I can see the need for a batch of backends, but in what situation would a user want to compare a batch of models before deploying?
  • Meanwhile, compute precisions such as FP32, FP16 and INT8 should be taken into consideration, too.: I think this is necessary for users. In my original idea the test runs pytorch -> FP32 -> FP16 -> INT8, and the user just needs to set the config like --deploy-cfg xxx_fp16_xxx.py,xxx_fp32_xxx.py,xxx_int8_xxx.py. Is that okay?
  • Regarding the final report, it is necessary to show the matching result (Yes or No) besides the performance numbers.: The script will generate a report and use bold to mark the best result. In my experience it is very likely that the scores will not match exactly, e.g. the PyTorch hmean is 0.95 but the TensorRT FP16 hmean is 0.96. Does (Yes or No) mean that equal or higher counts as Yes?

This script is not just for users. It is for MMDeploy's regression test. That is to say, every time MMDeploy releases a new version, it has to run a full test: check whether all supported models can be deployed to all supported backends correctly. How do we measure correctness? By comparing the metrics of the training model and the deployed model.

PeterH0323 commented 2 years ago

...

This script is not just for users. It is for MMDeploy's regression test. That is to say, every time MMDeploy releases a new version, it has to run a full test: check whether all supported models can be deployed to all supported backends correctly. How do we measure correctness? By comparing the metrics of the training model and the deployed model.

OK, I know what to do. 😆

Prerequisite

Before using this script, the user has to install the backend tools that he/she wants to use, such as TensorRT, ONNX and so on. For MMDeploy's regression test, all supported backends need to be installed.
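Since a regression run needs every backend present, the script could fail fast with a check along these lines; the module names listed are only illustrative, not a definitive set.

```python
# Sketch: verify that the required backend Python packages are importable before
# starting a long regression run. The module names below are illustrative only.
import importlib.util

REQUIRED_BACKENDS = ['tensorrt', 'onnxruntime', 'ncnn', 'openvino']

missing = [name for name in REQUIRED_BACKENDS
           if importlib.util.find_spec(name) is None]
if missing:
    raise RuntimeError(f'Missing backend packages for the regression test: {missing}')
```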

Usage

python ./tools/accuracy_test.py \
    --deploy-cfg "${DEPLOY_CFG_PATH}" \
    --model-cfg "${MODEL_CFG_PATH}" \
    --checkpoint "${MODEL_CHECKPOINT_PATH}" \
    --dataset-path "${TEST_DATASET_PATH}" \
    --work-dir "${WORK_DIR}" \
    --calib-dataset-cfg "${CALIB_DATA_CFG}" \
    --device "${DEVICE}" \
    --log-level INFO \
    --show \
    --dump-info

Description of all arguments: same as above.

model-cfg.yaml example

mmdet:
    code_dir: ${PATH_TO_MMDET}
    checkpoint_dir: ${PATH_TO_MMDET_CHECKPOINT_DIR}
    dataset_dir:  # TODO: is it necessary to set this under each model?
        - coco: ${PATH_TO_COCO_DATASET_DIR}
        - xxx
    calib-dataset-cfg:  # TODO: is it necessary to set this under each model?
        - coco: ${PATH_TO_COCO_CALIB_DATASET_CFG}
        - xxx
    models:
        - hrnet 
        - yolo
        - yolox
        - yolof
        - ...

mmcls:
      (same as mmdet)

mmpose:
      (same as mmdet)
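A sketch of how such a model-cfg.yaml could be consumed with PyYAML; the field names follow the draft layout above, which is still a proposal.

```python
# Sketch: read the proposed model-cfg.yaml and list what needs to be tested per codebase.
# Field names follow the draft layout above, which is still under discussion.
import yaml

with open('./test/model-cfg.yaml') as f:
    cfg = yaml.safe_load(f)

for codebase, info in cfg.items():          # e.g. 'mmdet', 'mmcls', 'mmpose'
    print(f'{codebase}: code_dir={info["code_dir"]}')
    for model in info.get('models', []):    # e.g. 'hrnet', 'yolo', 'yolox', 'yolof'
        print(f'  model family to test: {model}')
```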

Regression Test

Example

# If set to "./configs", all codebases will be tested
python ./tools/accuracy_test.py \
    --deploy-cfg "./configs/mmdet" \
    --model-cfg "./test/model-cfg.yaml" \
    --work-dir "${WORK_DIR}" \
    --show \
    --device cuda:0

Step

The script will complete the following steps:

  1. Load the config files under the deploy-cfg directory, ignoring any directory named _base_.
  2. According to the deploy-cfg directory name, read model-cfg.yaml to get the models that need to be tested, then iterate over those models, loading each test_pipeline and evaluating the dataset to get the baseline of the PyTorch model.
  3. Convert the model with each config file under the deploy-cfg directory one by one, and then, as in step 2, run the test with the test_pipeline.
  4. Print the comparison of all models and backends to the terminal, like the table below, and also save it to an Excel file (a sketch of this traversal and report dump follows the table):

| model_type | model_name | model_cfg | deploy_cfg | hmean | precision | recall | FPS | test pass |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mmdet | yolov3_d53_mstrain-608_273e_coco.pth | yolov3_d53_mstrain-608_273e_coco.py | - | 0.95 | 0.98 | 0.88 | 30 | - |
| mmdet | yolov3_d53_mstrain-608_273e_coco.pth | yolov3_d53_mstrain-608_273e_coco.py | detection_tensorrt_dynamic-320x320-1344x1344.py | 0.95 | 0.97 | 0.87 | 300 | √ |
| ... | xxx.pth | xxx.py | - | xxx | xxx | xxx | xxx | - |
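The config traversal in steps 1-3 and the Excel dump in step 4 could look roughly like the sketch below; it assumes pandas (with openpyxl) is available, and run_one_test() is a stub standing in for the real convert-and-evaluate logic.

```python
# Sketch of the regression traversal: collect deploy configs (skipping _base_), test each
# one, print the summary, and dump it to an Excel file. run_one_test() is a placeholder.
from pathlib import Path

import pandas as pd  # the .xlsx export also needs openpyxl installed


def collect_deploy_cfgs(root):
    """Yield every deploy config under `root`, ignoring any `_base_` directory."""
    for path in Path(root).rglob('*.py'):
        if '_base_' not in path.parts:
            yield path


def run_one_test(deploy_cfg):
    """Placeholder for converting with `deploy_cfg` and evaluating the deployed model."""
    return {'hmean': 0.95, 'precision': 0.97, 'recall': 0.87, 'FPS': 300, 'test pass': '√'}


rows = []
for deploy_cfg in collect_deploy_cfgs('./configs/mmdet'):
    metrics = run_one_test(deploy_cfg)
    rows.append({'model_type': 'mmdet', 'deploy_cfg': deploy_cfg.name, **metrics})

report = pd.DataFrame(rows)
print(report.to_string(index=False))
report.to_excel('regression_report.xlsx', index=False)
```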

User Test

Example

python ./tools/accuracy_test.py \
    --deploy-cfg "configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py" \
    --model-cfg "$PATH_TO_MMDET/configs/yolo/yolov3_d53_mstrain-608_273e_coco.py" \
    --checkpoint "$PATH_TO_MMDET/checkpoints/yolo/yolov3_d53_mstrain-608_273e_coco.pth" \
    --dataset-path "${PATH_TO_MMDET}/data/coco/val2017" \
    --work-dir "${WORK_DIR}" \
    --show \
    --device cuda:0

Step

The script will complete the following steps:

  1. Use the model_cfg to load the test_pipeline and evaluate the dataset. This gives the baseline of the PyTorch model, which I name baseline A.
  2. The script converts the model according to the deploy_cfg; taking TensorRT FP16 as an example, we get A_trt_fp16.engine after this step.
  3. As in step 1, run the test with the test_pipeline on the converted model to get baseline B.
  4. Print the comparison of baseline A and baseline B to the terminal, like:

| model_cfg | hmean | precision | recall | FPS |
| --- | --- | --- | --- | --- |
| yolov3_d53_mstrain-608_273e_coco.py | 0.95 | 0.98 | 0.88 | 30 |
| detection_tensorrt_dynamic-320x320-1344x1344.py | 0.95 | 0.97 | 0.87 | 300 |