Irwin-Liu / hfnet-tf2onnx

Convert a trained HFNet model from TensorFlow to ONNX

> I have tested HFNet with TF-TRT on TensorFlow 1.x and TensorFlow 2 #3

Open YJC-666 opened 2 weeks ago

YJC-666 commented 2 weeks ago
> I have tested HFNet with TF-TRT on TensorFlow 1.x and TensorFlow 2

When using TensorFlow 1.x, the dynamic op option (`is_dynamic_op`) must be set to true, and initialization takes a long time. When using TensorFlow 2, we can use data to build the engine and save it. Here is my benchmark:

| Platform | Precision | Device | Size | Inference Time | Loading Time / Notes |
| --- | --- | --- | --- | --- | --- |
| TensorRT@TF1 | FP16 | TITAN | 400x208 | 8.49 ms ± 151 µs | |
| TensorRT@TF1 | FP32 | TITAN | 400x208 | 8.29 ms ± 37.3 µs | |
| TensorFlow1.x | FP32 | TITAN | 400x208 | 12.1 ms ± 1.89 ms | |
| TensorRT@TensorFlow2 | FP16/FP32 | TITAN | 400x208 | 6.95 ms ± 114 µs / 6.86 ms ± 48.3 µs | |
| TensorRT@TF1 | FP16 | TX2 | 400x208 | 28.3 ms ± 166 µs (100kps), 29.1 ms ± 883 µs (200kps) | 19.35 (Graph Loading) + 240.52s |
| TensorRT@TF1 | FP32 | TX2 | 400x208 | 31.3 (100kps), 31.6 (200kps) | |
| TensorFlow1.x | - | TX2 | 400x208 | 67.1ms (200kps) | |
| TensorRT@TF2 | FP32 | TX2 | 400x208 | 37.5 ms ± 196 µs per loop | - |
| TensorRT@TF2 | FP16 | TX2 | 400x208 | 34.5 ms ± 140 µs per loop | 68.5s (Graph Loading) + 22.08s |
| TensorFlow2 | FP32 | TX2 | 400x208 100KPS | 49.13ms@100KPS, 49.26ms@200KPS | Tested on real data; stable, no case above 100ms |
| TensorRT@TF2 | FP16 | TX2 | 400x208@100kps | 119.43ms | Inference time is unstable when features are lacking; the error may be caused by zero vectors |
| TensorRT@TF1 | FP16 | TX2 | 400x208@100kps | 66ms | Dynamic op = False; on real data |
| TF1.x | FP32 | TX2 | 400x208 | 67.39ms@100KPS, 67.39ms@200KPS, 67.27ms@200KPS, 0.15mem | Real data |
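For anyone who wants to reproduce the "build the engine with data and save it" step on TensorFlow 2, here is a minimal sketch using the public `tf.experimental.tensorrt` API. It is not the code behind the numbers above. The function names, the directory arguments, and the input layout (one grayscale NHWC image, with 400x208 read as width x height) are assumptions for illustration, and the exact `ConversionParams` signature has changed between TF 2.x releases, so check the version you have installed.

```python
def input_shape_nhwc(width=400, height=208, batch=1, channels=1):
    """Shape of one dummy input batch.

    Assumption: HFNet takes a single-channel image in NHWC layout and
    "400x208" means width x height. Adjust if your export differs.
    """
    return (batch, height, width, channels)


def build_and_save_trt_model(saved_model_dir, output_dir, precision="FP16"):
    """Convert a SavedModel with TF-TRT, pre-build the engine, and save it.

    `saved_model_dir` and `output_dir` are hypothetical paths supplied by
    the caller; `precision` is "FP16" or "FP32".
    """
    # Imported lazily so the helper above can be used without TensorFlow.
    import numpy as np
    import tensorflow as tf

    params = tf.experimental.tensorrt.ConversionParams(precision_mode=precision)
    converter = tf.experimental.tensorrt.Converter(
        input_saved_model_dir=saved_model_dir,
        conversion_params=params,
    )
    # Replace the TensorRT-compatible subgraphs with TRT engine ops.
    converter.convert()

    def input_fn():
        # One representative input; the engine is built for this shape.
        yield (np.zeros(input_shape_nhwc(), dtype=np.float32),)

    # Build the engines now instead of on the first inference call.
    converter.build(input_fn=input_fn)
    # The saved directory contains the converted graph and the built engines.
    converter.save(output_dir)
```

The saved output is a regular SavedModel, so it can be loaded with `tf.saved_model.load(output_dir)` and called through its serving signature. Building ahead of time moves the engine construction cost out of the first inference call, but it does not by itself address inputs whose shapes differ from the one used in `input_fn`.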

However, TF-TRT acceleration for HFNet is not stable. I found that when an image lacks feature points, TF-TRT detects zero-dimension tensors and falls back to TensorFlow. This procedure is really slow and sometimes takes a few seconds in practice. You can see that it becomes really slow on my real data. I also tested cutting HFNet into SuperPoint and NetVLAD (the cut keeps the models consistent with HFNet, rather than using standalone SuperPoint and NetVLAD directly).

| Platform | Precision | Device | Size | Inference Time | Notes |
| --- | --- | --- | --- | --- | --- |
| TensorFlow2 SuperPoint@HFNet | FP32 | TX2 | 400x208 | 38ms@200kps | Real data; first try 4sec |
| TensorFlow2 NetVLAD@HFNet | FP32 | TX2 | 400x208 | 25.49ms | Real data; first try 8.4sec |
| TensorRT@TensorFlow2 NetVLAD@HFNet | FP16 | TX2 | 400x208 | 12.73ms | Real data; first try 5.57sec |
| TensorRT@TensorFlow NetVLAD@HFNet | FP32 | TX2 | 400x208 | 12.75ms | Real data; first try 5.5sec |
| TensorRT@TensorFlow SuperPoint@HFNet | FP16 | TX2 | 400x208 | 26~27ms | Fast case only; not practical now. First try 27ms. Sometimes becomes very large, ~400ms |
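The numbers above mix two effects: a slow first call ("first try") and occasional slow frames afterwards. A small timing harness that reports them separately makes this kind of instability easier to see. This is only a sketch using the standard library, not the script used for the table; `measure_latency`, the `slow_factor` threshold, and the injectable `clock` are illustrative choices.

```python
import statistics
import time


def measure_latency(fn, runs=50, slow_factor=3.0, clock=time.perf_counter):
    """Time `fn()` and report the first call separately from later calls.

    All times are in milliseconds. A run counts as an outlier when it
    takes longer than `slow_factor` times the median (an arbitrary rule).
    """
    def timed_call():
        start = clock()
        fn()
        return (clock() - start) * 1000.0

    # The first call usually includes graph or engine initialization.
    first_ms = timed_call()
    samples = [timed_call() for _ in range(runs)]
    median_ms = statistics.median(samples)
    outliers = [t for t in samples if t > slow_factor * median_ms]
    return {
        "first_ms": first_ms,
        "mean_ms": statistics.mean(samples),
        "std_ms": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        "median_ms": median_ms,
        "max_ms": max(samples),
        "outliers": len(outliers),
    }
```

Typical use would be `measure_latency(lambda: infer(image))`, where `infer` stands for whatever callable runs the model on one frame (hypothetical here). Feeding it real frames, including ones with few feature points, should show the fallback cases as outliers rather than hiding them in the mean.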

Thanks a lot! I'm going to give it a try ASAP.

You also mentioned that TF2 can build an engine. Is it loaded directly through the TF-TRT API? If so, I guess HFNet could be converted to a TRT engine under TF 2.0 with the other necessary tools. I will try that as well.

Keep in touch :)

Originally posted by @Irwin-Liu in https://github.com/Irwin-Liu/hfnet-tf2onnx/issues/1#issuecomment-709180357

YJC-666 commented 2 weeks ago

Could you share the code that performs the conversion under TensorFlow 2.x? I would like to use it on newer devices.