Tencent / ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Quantized model returns different results compared to float32 model #3610

Closed jinkilee closed 2 years ago

jinkilee commented 2 years ago

I am using faceboxes.param and faceboxes.bin for my face recognition model, and I want to quantize them.

To do this, I ran the following steps:

  1. ncnnoptimize faceboxes.param faceboxes.bin faceboxes-opt.param faceboxes-opt.bin 0
  2. ncnn2table faceboxes-opt.param faceboxes-opt.bin imagelist-widerface.txt faceboxes.table mean=[104,117,123] norm=0 shape=[300,300,3] pixel=BGR thread=8 method=kl (my model takes 300x300x3 inputs; the mean and norm values were double-checked many times; inference-time preprocessing matching these settings is sketched after this list)
  3. ncnn2int8 faceboxes-opt.param faceboxes-opt.bin faceboxes-int8.param faceboxes-int8.bin faceboxes.table
  4. This produced faceboxes-int8.param and faceboxes-int8.bin.
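
For reference, a minimal sketch of inference-time preprocessing and int8 loading that mirrors the calibration settings above (300x300 BGR input, mean=[104,117,123], no norm scaling). The blob names "input" and "detection_out" are placeholders and would need to match the actual param file:

```cpp
// Sketch: preprocess the same way the calibration step was configured
// (shape=[300,300,3], pixel=BGR, mean=[104,117,123], norm=0 i.e. no scaling),
// then run the int8 model with the int8 path enabled.
#include "net.h"
#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat bgr = cv::imread("test.jpg", 1);

    // Resize to 300x300, keep BGR channel order (matches pixel=BGR, shape=[300,300,3])
    ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data, ncnn::Mat::PIXEL_BGR,
                                                 bgr.cols, bgr.rows, 300, 300);

    // Subtract the same mean used for ncnn2table; norm=0 means no scaling,
    // so pass 0 for norm_vals.
    const float mean_vals[3] = {104.f, 117.f, 123.f};
    in.substract_mean_normalize(mean_vals, 0);

    ncnn::Net net;
    net.opt.use_int8_inference = true;            // enable the int8 path
    net.load_param("faceboxes-int8.param");
    net.load_model("faceboxes-int8.bin");

    ncnn::Extractor ex = net.create_extractor();
    ex.input("input", in);                        // placeholder blob name
    ncnn::Mat out;
    ex.extract("detection_out", out);             // placeholder blob name
    return 0;
}
```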

When I run inference with the int8 model, it gives different results from the float32 one. For example, the float32 model detects a person in an image, but the int8 model detects nothing in the same image.

I also set net.opt.use_int8_inference = true when running the int8 model.

Is there anything I am missing?
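
To make the comparison concrete, a rough sketch that feeds the same preprocessed input through both the fp32 and int8 models and prints how many detections each returns (model paths taken from the steps above; blob names are again placeholders):

```cpp
// Sketch: run the same input through the fp32 and int8 models and compare outputs.
#include "net.h"
#include <cstdio>

static ncnn::Mat run_model(const char* param, const char* bin,
                           const ncnn::Mat& in, bool int8)
{
    ncnn::Net net;
    net.opt.use_int8_inference = int8;
    net.load_param(param);
    net.load_model(bin);
    ncnn::Extractor ex = net.create_extractor();
    ex.input("input", in);            // placeholder blob name
    ncnn::Mat out;
    ex.extract("detection_out", out); // placeholder blob name
    return out;
}

int main()
{
    // Stand-in input; in practice build it with from_pixels_resize and
    // substract_mean_normalize exactly as in the preprocessing sketch above.
    ncnn::Mat in(300, 300, 3);
    in.fill(0.f);

    ncnn::Mat out_fp32 = run_model("faceboxes-opt.param",  "faceboxes-opt.bin",  in, false);
    ncnn::Mat out_int8 = run_model("faceboxes-int8.param", "faceboxes-int8.bin", in, true);

    // For an SSD-style DetectionOutput, each row is typically [label, score, x1, y1, x2, y2]
    printf("fp32 detections: %d, int8 detections: %d\n", out_fp32.h, out_int8.h);
    return 0;
}
```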

I also did the same thing with mnet.25-opt.bin (downloaded from the link) and got mnet.25-int.param and .bin. However, this one runs slower than the float32 model, even though its file size is roughly half. Why would the quantized model be slower than the float32 model?
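
For the speed question, a simple check is to time one forward pass on each net with the same thread settings. A minimal sketch (single run only; in practice warm up first and average over many runs, since the first pass includes setup cost):

```cpp
// Sketch: time a single forward pass of an already-loaded ncnn::Net.
#include "net.h"
#include <chrono>

static double time_forward_ms(ncnn::Net& net, const ncnn::Mat& in)
{
    auto t0 = std::chrono::steady_clock::now();
    ncnn::Extractor ex = net.create_extractor();
    ex.input("input", in);            // placeholder blob name
    ncnn::Mat out;
    ex.extract("detection_out", out); // placeholder blob name
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```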

weilanShi commented 2 years ago

I have encountered the same problem. Have you solved it?