Open · xiaoxiongli opened this issue 3 years ago
Hello, xiaoxiong: It runs on the CPU, so it's really slow. Other participants and I have encountered the same problem.
Hi NJU-Jet, thank you for your reply!
So you mean that this model is suited to running on a mobile device's GPU (not CPU)?
Do you mean that it is fast on a mobile device's GPU, and that I had better test it with your AI Benchmark app (using the mobile device's GPU)?
I am also confused about why tflite models run so slowly on a desktop CPU (not only this model, but all other models as well). They are fast on mobile devices.
The desktop CPU is not optimized for integer operations, so TFLite runs very slowly there. TFLite can run fast on mobile devices such as ARM devices. If you want to improve the speed and your TensorFlow version is 2.x, you can use the following:
```python
interpreter = tf.lite.Interpreter(model_path=quantized_model_path, num_threads=num_threads)
```
where num_threads can be adjusted according to the number of CPU cores, for example num_threads = 16. I hope this answer is helpful to you guys.
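For reference, here is a minimal end-to-end sketch of running a quantized model this way, assuming TensorFlow 2.x; the model path and thread count below are placeholders, not values from this repo:

```python
import numpy as np
import tensorflow as tf

# Placeholder path to the quantized TFLite model.
quantized_model_path = 'model_quantized.tflite'

# num_threads lets TFLite parallelize ops across CPU cores.
interpreter = tf.lite.Interpreter(model_path=quantized_model_path, num_threads=16)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input matching the model's expected shape and dtype.
dummy_input = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], dummy_input)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])
```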
Thank you so much!
Hi NJU-Jet,
My Linux server has several 2.6 GHz CPUs and several V100 GPUs, and I ran generate_tflite.py to get a quantized model.
Then, in the function evaluate, I added code like the following to measure the inference time:
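Roughly like this (a simplified sketch of what I added; interpreter, input_details, and lr_image stand in for the actual variables in evaluate):

```python
import time

# 'interpreter' is the tf.lite.Interpreter loaded from the quantized model;
# 'lr_image' is one preprocessed input image (illustrative names).
interpreter.set_tensor(input_details[0]['index'], lr_image)

start = time.time()
interpreter.invoke()
print('inference time: {:.2f}s per image'.format(time.time() - start))
```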
It seems the inference is very slow: it costs about 70 seconds per image.
I wonder whether this inference runs on the CPU or the GPU, and why it is so slow?
Thank you very much!