dedoogong opened 3 years ago
Hi @dedoogong ,
Thanks for reaching out!
Do you mind sharing the code you used to convert and benchmark the model? This will help me reproduce what you see.
Best, John
I have converted a semantic segmentation model from FP32 to FP16 using .half(), but I see only about a 10% speedup; inference time did not drop anywhere near half.
Can someone help here?
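One thing worth ruling out first is the benchmarking method itself: CUDA kernel launches are asynchronous, so timing a forward pass without synchronizing can report misleadingly similar numbers for FP32 and FP16. Below is a minimal, framework-agnostic timing sketch; the `sync` hook and the `model`/`x` names in the usage note are assumptions for illustration, not code from this thread.

```python
import time

def benchmark(fn, *args, warmup=10, iters=100, sync=None):
    """Return the mean latency of fn(*args) in milliseconds.

    `sync` is an optional zero-argument callable run before each
    timestamp; on CUDA pass torch.cuda.synchronize so queued kernels
    finish before the clock is read, otherwise async launches can be
    mistaken for fast inference.
    """
    for _ in range(warmup):      # warm up caches / autotuners first
        fn(*args)
    if sync:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    if sync:
        sync()                   # wait for the last iteration to finish
    return (time.perf_counter() - start) / iters * 1000.0
```

With PyTorch this would be called as, e.g., `benchmark(model, x, sync=torch.cuda.synchronize)` once for the FP32 model and once after `model.half()` with an FP16 input, so both measurements are taken the same way.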
I have converted the R2D2 model from https://github.com/naver/r2d2/blob/master/extract.py using fp16_mode=True on a T4 (which supports Tensor Cores), but it shows almost no speedup (just 10~14%). The model is quite simple (maybe even simpler than ResNet-18, or comparable to VGG-18).
The conversion was very simple: I just imported the torch2trt module and called it on the network. Can you please help me figure out how to get the expected speedup?
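For context, one common reason fp16_mode=True yields little gain on Turing GPUs like the T4 is layer-shape alignment: FP16 Tensor Core kernels generally engage only when the relevant dimensions (notably conv channel counts) are multiples of 8, and misaligned layers fall back to slower math, capping the end-to-end speedup. A rough heuristic sketch for auditing a model's layers (this is a general rule of thumb, not a torch2trt API; the conversion pattern in the comment is torch2trt's documented entry point):

```python
# Assumed conversion pattern from the torch2trt README:
#   from torch2trt import torch2trt
#   model_trt = torch2trt(model.eval().cuda(), [x], fp16_mode=True)

def tensor_core_eligible(in_channels: int, out_channels: int,
                         multiple: int = 8) -> bool:
    """Heuristic: an FP16 conv/GEMM can map onto Tensor Cores when
    both channel counts are multiples of `multiple` (8 on Volta/Turing).
    Layers failing this check are likely to run without Tensor Cores."""
    return in_channels % multiple == 0 and out_channels % multiple == 0
```

For example, `tensor_core_eligible(3, 32)` is False: the stem convolution of most vision networks takes a 3-channel RGB input, so that layer alone will not use Tensor Cores, while `tensor_core_eligible(64, 128)` is True. If many of a model's layers are small or misaligned, the FP16 speedup over the whole network can easily shrink to the 10~15% range observed here.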