Open · xiaoxiongli opened this issue 3 years ago
Hello, xiaoxiong: It runs on the CPU, so it's really slow. Other participants and I have encountered the same problem.
Hi NJU-Jet, thank you for your reply!
So you mean that this model is suited to running on a mobile device's GPU (not CPU)?
Do you mean that it is fast on a mobile device's GPU, and that I had better test it with your AI Benchmark app (using the mobile device's GPU)?
I am also confused about why tflite models run so slowly on a desktop CPU (not only this model, but all other models as well). They are fast on mobile devices.
The desktop CPU is not optimized for integer operations, so TFLite runs very slowly there. TFLite can run fast on mobile devices such as ARM devices. If you want to improve the speed and your TensorFlow version is 2.x, you can use the following:
```python
interpreter = tf.lite.Interpreter(model_path=quantized_model_path, num_threads=num_threads)
```
where num_threads can be adjusted according to the number of CPU cores, for example num_threads = 16. I hope this answer is helpful to you guys.
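For reference, here is a minimal end-to-end sketch of running a quantized model this way, assuming TensorFlow 2.x; the model path and thread count below are placeholders, not values from this repo:

```python
import numpy as np
import tensorflow as tf

# Placeholder path to the quantized TFLite model.
quantized_model_path = 'model_quantized.tflite'

# num_threads lets TFLite parallelize ops across CPU cores.
interpreter = tf.lite.Interpreter(model_path=quantized_model_path, num_threads=16)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input matching the model's expected shape and dtype.
dummy_input = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], dummy_input)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
```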
Thank you so much!
Hi NJU-Jet,
My Linux server has several 2.6 GHz CPUs and several V100 GPUs, and I ran generate_tflite.py to get a quantized model.
Then, in the function evaluate, I added code like the following to measure the inference time:
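Roughly like this (a simplified sketch of what I added; interpreter, input_details, and lr_image stand in for the actual variables in evaluate):

```python
import time

# 'interpreter' is the tf.lite.Interpreter loaded from the quantized model;
# 'lr_image' is one preprocessed input image (illustrative names).
interpreter.set_tensor(input_details[0]['index'], lr_image)

start = time.time()
interpreter.invoke()
print('inference time: {:.2f}s per image'.format(time.time() - start))
```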
It seems the inference is very slow: it costs about 70 seconds per image.
I wonder whether this inference runs on the CPU or the GPU, and why it is so slow?
Thank you very much!