NVIDIA-AI-IOT / jetson_benchmarks

Jetson Benchmark
MIT License

FPS of unet segmentation is lower than expected #13

Closed pavloshargan closed 3 years ago

pavloshargan commented 3 years ago

I am using a Jetson NX and trying to speed up my segmentation model (unet-mobilenet-512x512). I converted my TensorFlow model to TensorRT with FP16 precision mode, but the speedup is lower than I expected: before the optimization I had 7 FPS on inference with the .pb frozen graph; after TensorRT optimization I get 14 FPS.
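One common source of misleadingly low FPS numbers is timing that includes one-off engine warm-up or host-side pre/post-processing. Below is a minimal, framework-agnostic timing sketch (my addition, not from the thread); the `infer` callable and `dummy_infer` are hypothetical stand-ins for the actual TensorRT engine execution:

```python
import time

def measure_fps(infer, n_warmup=10, n_iters=100):
    """Measure steady-state FPS of an inference callable.

    Warm-up iterations are excluded from timing: the first calls into a
    TensorRT engine can include lazy initialization and are not
    representative of sustained throughput.
    """
    for _ in range(n_warmup):
        infer()
    start = time.perf_counter()
    for _ in range(n_iters):
        infer()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed

# Hypothetical stand-in workload; replace with the real engine call.
def dummy_infer():
    time.sleep(0.001)  # pretend one inference takes ~1 ms

fps = measure_fps(dummy_infer)
```

Comparing steady-state numbers measured this way (rather than wall-clock time over a whole script) makes the .pb vs. TensorRT comparison fairer.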

I ran the benchmark on my Jetson NX, and the unet 256x256 segmentation speed (the speed of the .uff model provided in this repo) is indeed 146 FPS. I thought the speed of my unet 512x512 should be at worst 4 times slower, since it processes 4 times as many pixels. Maybe I should run inference a different way (without TensorFlow), or change some conversion parameters? Is there an end-to-end pipeline for optimizing and running inference on TensorFlow models? I am looking for a way to get my model (unet-mobilenet-512x512) close to 30 FPS.
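The back-of-the-envelope scaling above can be made explicit. Assuming throughput scales inversely with pixel count (an optimistic simplification; larger inputs can also change memory behavior), the benchmarked 256x256 result implies roughly:

```python
# Benchmark result for the repo's unet 256x256 model on Jetson NX (from the question).
fps_256 = 146.0

# A 512x512 input has 4x the pixels of a 256x256 input.
pixel_ratio = (512 * 512) / (256 * 256)

# Naive estimate: throughput inversely proportional to pixel count.
fps_512_estimate = fps_256 / pixel_ratio
print(fps_512_estimate)  # 36.5
```

The observed 14 FPS is well below this ~36.5 FPS estimate, which is why the question focuses on the conversion and inference pipeline rather than on raw hardware capability.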

ak-nv commented 3 years ago

Since this question is more about DL model optimization, could you please ask it on the NVIDIA Developer Forum for Jetson devices.

Closing this, since you were able to achieve the expected performance with the provided models. Please re-open if needed.