NVIDIA-AI-IOT / tf_trt_models

TensorFlow models accelerated with NVIDIA TensorRT
BSD 3-Clause "New" or "Revised" License

TX2 slower than reported #13

Open Fred-Erik opened 6 years ago

Fred-Erik commented 6 years ago

Hello Nvidia!

Thank you for the clear explanation and benchmarking on this site, and for testing out the different models; it is really appreciated! According to your execution-time table, I should get 54.4 ms when running ssd_inception_v2_coco on the TX2. Averaged over 200 runs, after the network is warmed up, I get 69.63 ms. Looking at the tegrastats output, the GPU does not seem to be utilized very efficiently (utilization varies over time, but it is rarely even close to 90%):
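For reference, a minimal sketch of how the "warm up, then average over 200 runs" measurement above might be implemented. The `infer` placeholder stands in for the actual TF-TRT `sess.run(...)` call and is an assumption, not code from the repo:

```python
import time

def benchmark(infer, warmup=10, runs=200):
    """Time `infer()` after discarding `warmup` runs; return mean latency in ms."""
    for _ in range(warmup):        # warm-up runs are excluded from timing
        infer()
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        times.append(time.perf_counter() - start)
    return sum(times) / len(times) * 1000.0  # mean latency in milliseconds

# Placeholder workload standing in for a TF-TRT session run.
def infer():
    sum(i * i for i in range(1000))
```

Usage would be `mean_ms = benchmark(infer)`, with `infer` replaced by a closure over the real TensorFlow session and feed dict.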

```
RAM 4167/7854MB (lfb 84x4MB) CPU [49%@2035,0%@2035,0%@2035,44%@2035,47%@2032,38%@2035] EMC_FREQ 7%@1866 GR3D_FREQ 18%@1300 APE 150 MTS fg 0% bg 0% BCPU@48C MCPU@48C GPU@47.5C PLL@48C Tboard@41C Tdiode@46.25C PMIC@100C thermal@48.3C VDD_IN 7862/4839 VDD_CPU 1763/820 VDD_GPU 2531/947 VDD_SOC 997/929 VDD_WIFI 0/33 VDD_DDR 1626/1271
```
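The GPU utilization claim above comes from the `GR3D_FREQ` field of the tegrastats line (here 18% at 1300 MHz). A small sketch of pulling that field out of a tegrastats sample, assuming the field format shown above:

```python
import re

# One tegrastats sample (abbreviated); GR3D_FREQ reports GPU utilization %
# and the current GPU clock in MHz.
line = ("RAM 4167/7854MB (lfb 84x4MB) "
        "CPU [49%@2035,0%@2035,0%@2035,44%@2035,47%@2032,38%@2035] "
        "EMC_FREQ 7%@1866 GR3D_FREQ 18%@1300 APE 150")

match = re.search(r"GR3D_FREQ (\d+)%@(\d+)", line)
gpu_util = int(match.group(1))  # GPU utilization in percent
gpu_mhz = int(match.group(2))   # GPU clock frequency in MHz
print(gpu_util, gpu_mhz)        # → 18 1300
```

Sampling this over time during the benchmark loop confirms whether the GPU is the bottleneck or sitting idle.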

I just followed all the steps in the GitHub readme and the notebook, so any idea what could be the cause of this? I am using JetPack 3.3 and TensorFlow 1.10.

Edit: see the Nvidia forums for more on this issue