changtimwu / changtimwu.github.com

Tim's testing/practice notes

TensorRT and inference benchmarking #104

Open changtimwu opened 4 years ago

changtimwu commented 4 years ago

official resnet50

yolov3

changtimwu commented 4 years ago

Benchmark result

fp16

bin/trtexec --avgRuns=100 --deploy=resnet50.prototxt --fp16 --batch=8 --iterations=10000 --output=prob --useSpinWait

t4bm_fp16_resnet50.txt

int8

bin/trtexec --avgRuns=100 --deploy=resnet50.prototxt --int8 --batch=8 --iterations=10000 --output=prob --useSpinWait

t4bm_int8_resnet50.txt
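The two runs above differ only in the precision flag. A minimal sketch of a helper that rebuilds both command lines so that nothing else changes between runs; `trtexec_command` is a hypothetical name, and every flag is copied from the commands above rather than checked against any particular TensorRT release:

```python
import shlex


def trtexec_command(precision, trtexec="bin/trtexec"):
    """Return the argument list for one benchmark run ('fp16' or 'int8')."""
    if precision not in ("fp16", "int8"):
        raise ValueError(f"unsupported precision: {precision}")
    # Same flags, same order as in the notes; only the precision flag varies.
    return [
        trtexec,
        "--avgRuns=100",
        "--deploy=resnet50.prototxt",
        f"--{precision}",
        "--batch=8",
        "--iterations=10000",
        "--output=prob",
        "--useSpinWait",
    ]


if __name__ == "__main__":
    for precision in ("fp16", "int8"):
        # Print the command line; pass the list to subprocess.run to execute it.
        print(shlex.join(trtexec_command(precision)))
```

Keeping the flags in one place makes it harder for the fp16 and int8 runs to drift apart by accident.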

changtimwu commented 4 years ago

Not sure whether qps is equivalent to fps. The internal variable is latencyThroughput, and the q stands for query. https://github.com/NVIDIA/TensorRT/blob/master/samples/common/sampleReporting.cpp#L153
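One way to frame the question: qps and fps are the same number only if one query carries exactly one frame. A minimal sketch of that relationship follows. `frames_per_query` is a hypothetical parameter, and whether trtexec counts a batch of 8 as one query or as eight is not settled here; that would need checking against the linked source.

```python
def frames_per_second(qps, frames_per_query=1):
    """Convert queries per second to frames per second.

    frames_per_query is an assumption about what one "query" contains:
    1 if a query is a single image, the batch size if a query is a whole batch.
    """
    if frames_per_query < 1:
        raise ValueError("frames_per_query must be at least 1")
    return qps * frames_per_query


if __name__ == "__main__":
    print(frames_per_second(100.0))     # one frame per query: fps equals qps
    print(frames_per_second(100.0, 8))  # one batch of 8 per query
```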

changtimwu commented 4 years ago

Worth reading the "How Do I Measure Performance?" section: "The overall system performance can be measured by the latency and throughput of the entire processing pipeline. Because the pre and post-processing steps depend so strongly on the particular application, in this section, we will mostly consider the latency and throughput of the network inference, excluding the data pre and post-processing overhead."
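A minimal sketch of what measuring only the network inference could look like, assuming the inference step is wrapped in a callable. `benchmark` and `infer` are hypothetical names, not part of TensorRT. For GPU work, `infer` would have to block until the result is ready, otherwise only the enqueue time is measured.

```python
import statistics
import time


def benchmark(infer, warmup=10, runs=100, batch_size=1):
    """Time only the infer() call; pre and post-processing stay outside."""
    for _ in range(warmup):
        infer()  # warm-up runs are not timed

    latencies_ms = []
    start = time.perf_counter()
    for _ in range(runs):
        t0 = time.perf_counter()
        infer()
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    wall_s = time.perf_counter() - start

    return {
        "mean_latency_ms": statistics.mean(latencies_ms),
        "median_latency_ms": statistics.median(latencies_ms),
        # Items per second over the whole timed loop.
        "throughput_per_s": runs * batch_size / wall_s,
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a real, blocking inference call.
    print(benchmark(lambda: sum(range(10_000)), batch_size=8))
```

Latency is taken per call and throughput over the whole timed loop, so the two numbers can be reported separately.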

changtimwu commented 4 years ago

Check the TensorRT examples:

dpkg -L libnvinfer-samples 

checkpoint to PB

# Step 1: export model
!python model_inspect.py --runmode=saved_model \
  --model_name=efficientdet-d0 --ckpt_path=efficientdet-d0 \
  --saved_model_dir=/tmp/saved_model
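After the export, a quick check that the output directory has the usual SavedModel layout (a saved_model.pb file plus a variables directory containing variables.index) can catch a failed export early. A minimal sketch using only the standard library; `missing_saved_model_files` is a hypothetical helper, not part of the efficientdet scripts:

```python
from pathlib import Path


def missing_saved_model_files(saved_model_dir):
    """Return the expected SavedModel files that are absent, as relative paths."""
    root = Path(saved_model_dir)
    expected = [
        root / "saved_model.pb",
        root / "variables" / "variables.index",
    ]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_file()]


if __name__ == "__main__":
    missing = missing_saved_model_files("/tmp/saved_model")
    print("export looks complete" if not missing else f"missing: {missing}")
```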

root@55b6f8ad558e:~/workspace/automl/efficientdet# ls -ahR /tmp/saved_model/
/tmp/saved_model/:
.  ..  saved_model.pb  variables

/tmp/saved_model/variables:
.  ..  variables.data-00000-of-00001  variables.index

changtimwu commented 4 years ago

Let's evaluate fps this way:

https://github.com/jkjung-avt/tensorrt_demos#ssd

changtimwu commented 4 years ago

Pretrained model inference with TensorRT

changtimwu commented 4 years ago

https://zhuanlan.zhihu.com/p/88318324

changtimwu commented 4 years ago

TensorRT acceleration

GTC 2020

Blog