google / qkeras

QKeras: a quantization deep learning library for TensorFlow Keras
Apache License 2.0

Get scale value after Full Integer Quantization #132

Open torukskywalker opened 5 months ago

torukskywalker commented 5 months ago

There is a related issue, #111, but I'm still confused about how to get the scale value after full integer quantization. Does anyone know how to get it?

jurevreca12 commented 5 months ago

The quantizers (e.g. quantized_bits) have a .scale attribute, so you can get the scale after running some inference by reading that attribute. Note that the scaling factors can be separate for each neuron, and in convolutional networks they can differ for every channel, so in general the scale attribute returns an array of scaling factors. Here is an example function that returns the scaling factors from the quantizers: https://github.com/cs-jsi/chisel4ml/blob/f78192562fe00633a9f5320e93001dc0fd802a2f/chisel4ml/transforms/qkeras_util.py#L358
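For illustration, a minimal sketch of reading that attribute after a forward pass (the tiny model here is hypothetical; `kernel_quantizer_internal` is the instantiated quantizer object that QKeras layers hold, the same attribute the linked function reads):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from qkeras import QDense, quantized_bits

# Hypothetical one-layer model; alpha="auto_po2" makes the quantizer
# compute power-of-two scaling factors per output channel.
model = Sequential([
    QDense(10,
           kernel_quantizer=quantized_bits(8, 0, alpha="auto_po2"),
           bias_quantizer=quantized_bits(8, 0, alpha="auto_po2"),
           input_shape=(784,)),
])

# Run some inference so the quantizers compute their scales.
_ = model.predict(np.random.rand(4, 784).astype("float32"))

for layer in model.layers:
    if hasattr(layer, "kernel_quantizer_internal"):
        # .scale is an array (one entry per channel) when alpha="auto*".
        print(layer.name, np.array(layer.kernel_quantizer_internal.scale))
```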

torukskywalker commented 5 months ago

A naive question: for INT8 quantization with the MNIST example, is the input divided by 255 (normalized to [0, 1]) or not?

jurevreca12 commented 5 months ago

Well, I usually don't divide the input to normalize it to [0, 1]; that way the inputs are integers. You could divide it, in which case you get non-integer inputs, which on a PC can be handled with floating point. If you then want to deploy this to custom hardware without floating-point units, you can use fixed-point arithmetic instead. As for how this affects training: in my experience it does not, at least for the (relatively) shallow networks I usually train.
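A minimal sketch of the two options on MNIST, just for illustration:

```python
import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, _), _ = mnist.load_data()  # uint8 pixels in 0..255

# Option A: keep the inputs as integers (no division), as described above.
x_int = x_train.astype("float32")        # values are still integral: 0.0 .. 255.0

# Option B: normalize to [0, 1]; the inputs become fractional, which works
# with floating point on a PC but needs fixed-point on FPU-less hardware.
x_norm = x_train.astype("float32") / 255.0
```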

torukskywalker commented 5 months ago

When doing inference on an edge device with INT8 weights/biases, the accuracy is very low. After tracing the C++ code step by step, we found the error is caused by the Softmax after the last Dense layer: the Softmax may output INF since some of the integer inputs are large. We tried dividing the output of the Dense layer by e.g. 255, but it did not help. Is there a solution or a replacement for Softmax in QKeras for INT8 training (so we could also replace the current Softmax in inference with INT8 weights)?

jurevreca12 commented 5 months ago

Hm, this is a bit unusual. I am not exactly sure what is causing your issues, but in general you can also skip the softmax in inference if you are only using it to pick the best class.
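For example (a sketch of the device-side logic in Python rather than C++): softmax preserves the ordering of its inputs, so the argmax of the raw logits already yields the predicted class, with no exponentiation that could overflow:

```python
import numpy as np

# Raw integer outputs of the last Dense layer (hypothetical values).
logits = np.array([1200, -340, 987], dtype=np.int32)

# softmax is monotonic, so picking the best class does not require
# computing it at all.
predicted_class = int(np.argmax(logits))  # -> 0
```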

Regarding large integers: one thing you could do (if you are not doing it already) is saturate the intermediate activations, e.g. use a ReLU with saturation.
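A minimal sketch using QKeras's quantized_relu, which clips (saturates) its output to the representable range:

```python
from qkeras import QActivation, quantized_relu

# With bits=8 and integer=4 the activation saturates just below 2**4 = 16,
# so the following layers never see unbounded intermediate values.
saturating_relu = QActivation(quantized_relu(bits=8, integer=4))
```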