Open xxy90 opened 3 years ago
Hi xxy90,
Thanks for reaching out!
Do you mind sharing which model architecture you're referring to? The relative performance of FP32 vs. FP16 can depend on the model architecture. The scaling also might not be linear with bit depth, because of various overheads when running at reduced precision.
Best, John
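One practical way to check whether the fp16 engine is actually faster is to time both engines directly. Below is a minimal timing sketch (pure Python; the `model_trt_fp32` / `model_trt_fp16` names in the comment are placeholders, not APIs from any specific library). Note that on a GPU you should synchronize before and after timing (e.g. `torch.cuda.synchronize()` in PyTorch), since kernel launches are asynchronous and un-synchronized timings can make FP32 and FP16 look identical.

```python
import time

def benchmark(fn, warmup=10, iters=100):
    """Return the mean latency of calling fn(), in milliseconds.

    Runs a few warmup iterations first so one-time costs (allocation,
    JIT/engine setup, caches) do not skew the measurement.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1e3

# On a Jetson you would compare something like (placeholder names):
#   t32 = benchmark(lambda: model_trt_fp32(x))
#   t16 = benchmark(lambda: model_trt_fp16(x))
# If t32 and t16 come out nearly equal, the model may be memory-bound
# rather than compute-bound, or some layers may be falling back to FP32.
```

If the two numbers match, that by itself does not mean the conversion failed; for small models the per-layer launch overhead and memory traffic can dominate, so halving the arithmetic precision changes little.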
Hi John, I had the same problem on a Jetson Nano: whether I convert with fp16=True or fp16=False, the speed is the same. I used the lightweight OpenPose model; here is the link: https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch
The model architecture is NanoDet, a new anchor-free object detector whose backbone is ShuffleNetV2.
I also hit the same problem while using YOLOX-Nano (ref https://github.com/Megvii-BaseDetection/YOLOX) on a Jetson Nano.