Closed · Tetsujinfr closed this issue 3 years ago
@Tetsujinfr Does rvm_mobilenetv3_fp16.torchscript have to run on a GPU? Why does running this model on the CPU throw a lot of errors, while float32 works fine?
I do not speak Chinese. Based on a translation, I am not sure I get your point. I am running inference on the GPU here, not the CPU. Even though the model is named "mobile", it is just a smaller network; it still executes on the GPU, no?
Well, when you ran inference with float16, which format did you use, .onnx or .torchscript? I used the rvm_mobilenetv3_fp16.torchscript distributed on GitHub, and the errors are as follows:

"compute_indices_weights_linear" not implemented for 'Half'
"unfolded2d_copy" not implemented for 'Half'
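For context: those two errors come from PyTorch's CPU backend, which has no Half (float16) kernels for several ops this model uses (e.g. the bilinear-interpolation and conv2d-unfold paths), so the fp16 TorchScript file effectively requires CUDA. Below is a minimal sketch, not the repo's official snippet, of the two workable paths: run fp16 on GPU, or cast the model back to float32 for CPU. The call signature follows the repo's documented `model(src, *rec, downsample_ratio)` API; the input size is an illustrative assumption.

```python
import torch

model = torch.jit.load("rvm_mobilenetv3_fp16.torchscript")

if torch.cuda.is_available():
    # fp16 path: keep Half precision and run on CUDA, where the kernels exist.
    model = model.cuda().eval()
    frame = torch.rand(1, 3, 1080, 1920, dtype=torch.float16, device="cuda")
else:
    # CPU fallback: cast the fp16 weights back to float32; otherwise ops like
    # interpolation raise "not implemented for 'Half'" on the CPU backend.
    model = model.float().eval()
    frame = torch.rand(1, 3, 1080, 1920, dtype=torch.float32)

rec = [None] * 4  # recurrent states; reset at the start of each clip
with torch.no_grad():
    fgr, pha, *rec = model(frame, *rec, 0.25)  # 0.25 = downsample_ratio
```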
Hi, thanks for this repo, really cool!
I gave inference_speed.py a try on a 3090 and got almost the same results: 169 fps for fp32 and 168 fps for fp16, at HD resolution with downsample 0.25, on mobilenetv3.
Shouldn't I get significantly faster speeds with fp16?
Edit: I can see a difference of 15% on resnet50, so I guess it has to do with the nature of the pre-trained models used.
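For anyone reproducing this, here is a rough timing sketch along the lines of what inference_speed.py measures (the input size, iteration count, and helper name are assumptions, not the script's exact code). Proper CUDA timing needs a warm-up and torch.cuda.synchronize(), otherwise the asynchronous kernel launches can make fp16 and fp32 look identical for the wrong reason.

```python
import time
import torch

def benchmark(model, dtype, n_iters=100, downsample_ratio=0.25):
    """Rough fps measurement for one precision; assumes a CUDA device."""
    model = model.to(device="cuda", dtype=dtype).eval()
    src = torch.rand(1, 3, 1080, 1920, dtype=dtype, device="cuda")
    rec = [None] * 4  # recurrent states, as in the repo's documented API
    with torch.no_grad():
        # Warm-up so cuDNN autotuning and lazy initialization don't skew timing.
        for _ in range(10):
            fgr, pha, *rec = model(src, *rec, downsample_ratio)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(n_iters):
            fgr, pha, *rec = model(src, *rec, downsample_ratio)
        torch.cuda.synchronize()  # wait for all queued kernels before stopping
    return n_iters / (time.time() - start)

model = torch.jit.load("rvm_mobilenetv3_fp16.torchscript")
print(f"fp16: {benchmark(model, torch.float16):.0f} fps")
print(f"fp32: {benchmark(model, torch.float32):.0f} fps")
```

One plausible explanation for the small mobilenetv3 gap, consistent with the 15% seen on resnet50: depthwise-separable convolutions tend to be memory-bound rather than compute-bound, so halving the arithmetic precision helps them less than it helps resnet50's dense convolutions.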