PaddlePaddle / PaddleDetection

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Apache License 2.0

About quantization: the frame rate measured after quantization is actually lower?? #2924

Closed dengxinlong closed 2 years ago

dengxinlong commented 3 years ago

[screenshot: eval results of the quantization-aware-trained model]

The above is the result from quantization-aware training. The command was:
python slim/quantization/eval.py -c configs/ssd/ssdlite_mobilenet_v3_large_fpn.yml -o weights=experiment/best_model/

[screenshot: eval results of the non-quantized model]

This is the result from the model without quantization.

As you can see, the quantization-aware model runs at 27.16 FPS while the non-quantized model runs at 35.34 FPS, so the frame rate is actually lower after quantization. Am I quantizing the wrong way??

yghstill commented 3 years ago

@dengxinlong If you run eval directly, the model still computes in FP32, so there is no speedup. On top of that, the accuracy drop after quantization increases NMS time, making inference even slower. To benchmark the quantized model, please use Paddle-Lite or TensorRT int8 inference.
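For clarity, the flow being suggested is: first export the quantization-aware-trained checkpoint to a deployable inference model, then benchmark it with real int8 kernels rather than eval.py's simulated quantization. A sketch of the export step, assuming the static branch ships a slim/quantization/export_model.py counterpart to eval.py (the script path and --output_dir flag are assumptions, not confirmed in this thread):

```
# Sketch: export the QAT checkpoint to an inference model for deployment.
# Script path and flags are assumptions based on the static branch layout.
python slim/quantization/export_model.py \
    -c configs/ssd/ssdlite_mobilenet_v3_large_fpn.yml \
    -o weights=experiment/best_model/ \
    --output_dir=inference_model
```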

dengxinlong commented 3 years ago

> @dengxinlong If you run eval directly, the model still computes in FP32, so there is no speedup. On top of that, the accuracy drop after quantization increases NMS time, making inference even slower. To benchmark the quantized model, please use Paddle-Lite or TensorRT int8 inference.

What I mainly want is to do quantization on a Jetson Xavier NX, but your documentation seems to target Android.

yghstill commented 3 years ago

@dengxinlong On Jetson, just run inference in TensorRT int8 mode.
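Concretely, that means running the exported model through the deploy script with run_mode set to trt_int8, mirroring the command used later in this thread (the model_dir and image path below are placeholders):

```
python deploy/python/infer.py \
    --model_dir=inference_model/ssdlite_mobilenet_v3_large_fpn \
    --image_file=test.jpg --use_gpu=True --run_mode=trt_int8
```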

dengxinlong commented 3 years ago

> @dengxinlong On Jetson, just run inference in TensorRT int8 mode.

Thanks for your reply. The documentation you pointed to uses infer.py, but what I want to run is eval.

dengxinlong commented 3 years ago

> @dengxinlong On Jetson, just run inference in TensorRT int8 mode.

Also, since I mainly want to test the model with quantization: does Paddle-Lite support the Jetson Xavier NX??

dengxinlong commented 3 years ago

> @dengxinlong On Jetson, just run inference in TensorRT int8 mode.

I ran into a problem compiling and installing Paddle-Lite on the Jetson Xavier NX. Does that mean Paddle-Lite cannot be installed on the Jetson Xavier NX? Error message: unrecognized command line option '-m16'
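An aside on the error: '-m16' is an x86 compiler option, so the failure suggests an x86-targeted build being attempted on the Jetson's ARM CPU. Building Paddle-Lite natively on ARM Linux would go through its dedicated build script instead; a sketch, assuming a Paddle-Lite checkout recent enough to ship lite/tools/build_linux.sh (script name and flags should be checked against your Paddle-Lite version):

```
# Sketch: native ARM-Linux (armv8) build of Paddle-Lite on the Jetson.
# Script name and flags are assumptions for recent Paddle-Lite releases.
git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite
./lite/tools/build_linux.sh --arch=armv8
```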

dengxinlong commented 3 years ago

> @dengxinlong If you run eval directly, the model still computes in FP32, so there is no speedup. On top of that, the accuracy drop after quantization increases NMS time, making inference even slower. To benchmark the quantized model, please use Paddle-Lite or TensorRT int8 inference.

You said to use TensorRT directly on the Jetson Xavier NX, but after exporting the model and running inference, I get an error saying TensorRT int8 is not supported.

Traceback (most recent call last):
  File "deploy/python/infer.py", line 601, in <module>
    main()
  File "deploy/python/infer.py", line 536, in main
    config, FLAGS.model_dir, use_gpu=FLAGS.use_gpu, run_mode=FLAGS.run_mode)
  File "deploy/python/infer.py", line 78, in __init__
    use_gpu=use_gpu)
  File "deploy/python/infer.py", line 397, in load_predictor
    raise ValueError("TensorRT int8 mode is not supported now, "
ValueError: TensorRT int8 mode is not supported now, please use trt_fp32 or trt_fp16 instead.

yghstill commented 3 years ago

@dengxinlong Please use the latest code from the release/2.0 or develop branch: https://github.com/PaddlePaddle/PaddleDetection/blob/develop/static/deploy/python/infer.py
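Switching to one of those branches, assuming a standard git clone of PaddleDetection:

```
cd PaddleDetection
git fetch origin
git checkout release/2.0    # or: git checkout develop
```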

dengxinlong commented 3 years ago

> @dengxinlong Please use the latest code from the release/2.0 or develop branch: https://github.com/PaddlePaddle/PaddleDetection/blob/develop/static/deploy/python/infer.py

Can the configuration files (.yml files) from release/2.0-rc be used on release/2.0??

dengxinlong commented 3 years ago

> @dengxinlong Please use the latest code from the release/2.0 or develop branch: https://github.com/PaddlePaddle/PaddleDetection/blob/develop/static/deploy/python/infer.py

One important question: can the static-graph code in release/2.0 be used with TensorRT int8??

yghstill commented 3 years ago

> @dengxinlong Please use the latest code from the release/2.0 or develop branch: https://github.com/PaddlePaddle/PaddleDetection/blob/develop/static/deploy/python/infer.py
>
> One important question: can the static-graph code in release/2.0 be used with TensorRT int8??

The static-graph code can use TensorRT int8.
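For context, in the static-graph deploy code enabling TensorRT int8 ultimately comes down to a Paddle Inference config call. A minimal sketch of that mechanism (model paths are hypothetical, and the exact enable_tensorrt_engine signature can vary slightly across Paddle releases):

```python
# Sketch: enabling TensorRT int8 through the Paddle Inference API.
# Paths are hypothetical; check the signature against your Paddle version.
from paddle.inference import Config, PrecisionType, create_predictor

config = Config("inference_model/__model__", "inference_model/__params__")
config.enable_use_gpu(200, 0)             # 200 MB initial GPU memory, device 0
config.enable_tensorrt_engine(
    workspace_size=1 << 30,               # 1 GB TensorRT workspace
    max_batch_size=1,
    min_subgraph_size=3,
    precision_mode=PrecisionType.Int8,    # request int8 kernels
    use_static=False,
    use_calib_mode=False)                 # QAT models carry scales, so no calibration pass
predictor = create_predictor(config)
```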

dengxinlong commented 3 years ago

> @dengxinlong If you run eval directly, the model still computes in FP32, so there is no speedup. On top of that, the accuracy drop after quantization increases NMS time, making inference even slower. To benchmark the quantized model, please use Paddle-Lite or TensorRT int8 inference.

Sigh, could your documentation be written a bit more clearly? It feels really disorganized!!

dengxinlong commented 3 years ago

> @dengxinlong Please use the latest code from the release/2.0 or develop branch: https://github.com/PaddlePaddle/PaddleDetection/blob/develop/static/deploy/python/infer.py
>
> One important question: can the static-graph code in release/2.0 be used with TensorRT int8??
>
> The static-graph code can use TensorRT int8.

Hi, following your earlier suggestion I used deploy/python/infer.py, but the one under static/. Run output:

(test) coded@coded-desktop:~/PaddleDetection/static$ python deploy/python/infer.py --model_dir=../../PaddleDetection_old/bestModel/ssdlite_mobilenet_v3_large_fpn/ --image_file=1478896904942573873.jpg  --use_gpu=True --threshold=0.2 --run_mode=trt_fp16
WARNING: AVX is not support on your machine. Hence, no_avx core will be imported, It has much worse preformance than avx core.
/home/coded/.local/virtualenvs/test/lib/python3.6/site-packages/paddle/utils/cpp_extension/extension_utils.py:461: UserWarning: Not found CUDA runtime, please use `export CUDA_HOME= XXX` to specific it.
  "Not found CUDA runtime, please use `export CUDA_HOME= XXX` to specific it."
-----------  Running Arguments -----------
camera_id: -1
image_file: 1478896904942573873.jpg
model_dir: ../../PaddleDetection_old/bestModel/ssdlite_mobilenet_v3_large_fpn/
output_dir: output
run_benchmark: False
run_mode: trt_fp16
threshold: 0.2
use_gpu: True
video_file: 
------------------------------------------
-----------  Model Configuration -----------
Model Arch: SSD
Use Paddle Executor: False
Transform Order: 
--transform op: Resize
--transform op: Normalize
--transform op: Permute
--------------------------------------------
W0513 20:54:04.807541 18561 analysis_predictor.cc:1145] Deprecated. Please use CreatePredictor instead.
Inference: 57.614803314208984 ms per batch image
class_id:1, confidence:0.4840,left_top:[1101.23,423.78], right_bottom:[1142.89,499.94]
class_id:4, confidence:0.6002,left_top:[1218.22,574.18], right_bottom:[1377.13,638.06]
class_id:4, confidence:0.5065,left_top:[359.96,611.97], right_bottom:[460.14,656.98]
class_id:4, confidence:0.3243,left_top:[650.40,604.32], right_bottom:[697.91,646.41]
class_id:4, confidence:0.2525,left_top:[746.66,601.77], right_bottom:[787.74,639.54]
save result to: output/1478896904942573873.jpg

The above is trt_fp16, but the model has been through quantization-aware training.

Below is the trt_int8 output:

(test) coded@coded-desktop:~/PaddleDetection/static$ python deploy/python/infer.py --model_dir=../../PaddleDetection_old/bestModel/ssdlite_mobilenet_v3_large_fpn/ --image_file=1478896904942573873.jpg  --use_gpu=True --threshold=0.2 --run_mode=trt_int8
WARNING: AVX is not support on your machine. Hence, no_avx core will be imported, It has much worse preformance than avx core.
/home/coded/.local/virtualenvs/test/lib/python3.6/site-packages/paddle/utils/cpp_extension/extension_utils.py:461: UserWarning: Not found CUDA runtime, please use `export CUDA_HOME= XXX` to specific it.
  "Not found CUDA runtime, please use `export CUDA_HOME= XXX` to specific it."
-----------  Running Arguments -----------
camera_id: -1
image_file: 1478896904942573873.jpg
model_dir: ../../PaddleDetection_old/bestModel/ssdlite_mobilenet_v3_large_fpn/
output_dir: output
run_benchmark: False
run_mode: trt_int8
threshold: 0.2
use_gpu: True
video_file: 
------------------------------------------
-----------  Model Configuration -----------
Model Arch: SSD
Use Paddle Executor: False
Transform Order: 
--transform op: Resize
--transform op: Normalize
--transform op: Permute
--------------------------------------------
W0513 19:59:15.472364 18171 analysis_predictor.cc:1145] Deprecated. Please use CreatePredictor instead.
Inference: 33687.761545181274 ms per batch image
class_id:1, confidence:0.4905,left_top:[1101.20,423.82], right_bottom:[1142.83,499.95]
class_id:4, confidence:0.6004,left_top:[1218.18,574.16], right_bottom:[1377.05,638.02]
class_id:4, confidence:0.5076,left_top:[359.73,611.98], right_bottom:[459.73,656.93]
class_id:4, confidence:0.3248,left_top:[650.41,604.29], right_bottom:[697.92,646.41]
class_id:4, confidence:0.2520,left_top:[746.22,601.75], right_bottom:[787.28,639.51]
save result to: output/1478896904942573873.jpg

The trt_fp16 inference time is 57.6 ms, while trt_int8 takes over 30,000 ms, which really makes no sense. Questions:

  1. The gap between the two is huge, and my model has already been through quantization-aware training. Why?
  2. When I evaluate with tools/eval.py after training, the FPS is above 30, yet the inference time here seems far too long in either mode. Is the measurement methodology different?

Environment: Jetson Xavier NX; test model: ssdlite-mobilenetv3_large_fpn; dataset: custom dataset.
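One likely contributor to the 33-second number, worth ruling out: the first TensorRT run builds the engine (and runs calibration when calibration mode is enabled), which can take a very long time, while tools/eval.py reports an FPS averaged over many images. A hedged benchmarking sketch that discards warm-up iterations before timing (run_inference is a hypothetical stand-in for the actual predict call):

```python
# Sketch: time steady-state latency only; the warm-up loop absorbs
# one-off costs such as TensorRT engine building and int8 calibration.
import time

def benchmark(run_inference, warmup=10, iters=50):
    for _ in range(warmup):
        run_inference()                   # engine build happens on early runs
    start = time.time()
    for _ in range(iters):
        run_inference()
    latency_ms = (time.time() - start) / iters * 1000
    print(f"avg latency: {latency_ms:.2f} ms ({1000 / latency_ms:.2f} FPS)")
```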
thorory commented 3 years ago

Same question here: on a Jetson Xavier NX with a DarkNet53 model, trt_int8 after quantization-aware training runs at roughly half the frame rate of trt_fp16 before quantization. I used the default configuration under config/slim.

paddle-bot-old[bot] commented 2 years ago

Since this issue has not been updated for more than three months, it will be closed. If it is not solved or there is a follow-up, please reopen it at any time and we will continue to follow up. It is recommended to pull and try the latest code first.