torukskywalker opened this issue 7 months ago
The quantizers (e.g. quantized_bits) have a .scale attribute, so you can get the scale after running some inference by simply reading that attribute.
Note that scaling factors can be separate for each neuron, or in convolutional networks they can be different for every channel. So in general, the scale attribute returns an array of scaling factors.
https://github.com/cs-jsi/chisel4ml/blob/f78192562fe00633a9f5320e93001dc0fd802a2f/chisel4ml/transforms/qkeras_util.py#L358
Here is an example function that returns the scaling factors from quantizers.
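To make this concrete, here is a minimal sketch (not the linked function itself) of reading .scale from a quantized_bits quantizer. The alpha="auto_po2" setting and the random weight tensor are illustrative assumptions; .scale is only meaningful after the quantizer has been called on some data.

```python
# Minimal sketch: reading the per-channel scale of a QKeras quantizer.
# Assumptions: QKeras is installed; the weight values below are made up.
import numpy as np
import tensorflow as tf
from qkeras import quantized_bits

quantizer = quantized_bits(bits=8, integer=0, alpha="auto_po2")

# Pretend these are the weights of a Dense layer with 3 output neurons.
weights = tf.constant(np.random.uniform(-1.5, 1.5, (4, 3)), dtype=tf.float32)

_ = quantizer(weights)   # calling the quantizer updates quantizer.scale
print(quantizer.scale)   # one power-of-two scaling factor per output channel
```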
A naive question: for INT8 quantization with the MNIST example, is the input divided by 255 (normalized to [0, 1]) or not?
Well, I usually don't divide the input to normalize it to [0, 1]; that way the inputs stay integer. You could divide it, in which case you get non-integer inputs, which work fine with floating point on a PC. If you then want to deploy this to custom hardware without floating-point units, you can use fixed-point. As for how this affects training: in my experience it does not, at least for the (relatively) shallow networks I usually train.
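For illustration, here is a sketch of the two input conventions using the standard Keras MNIST loader (the variable names are just illustrative):

```python
# The two input conventions discussed above, side by side.
from tensorflow.keras.datasets import mnist

(x_train, _), (x_test, _) = mnist.load_data()

x_int = x_train.astype("float32")           # values stay 0..255, i.e. integer-valued
x_norm = x_train.astype("float32") / 255.0  # values in [0, 1], no longer integer
```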
When doing inference on an edge device with INT8 weights/biases, the accuracy is very low.
After tracing the C++ code step by step, we found the error is caused by the Softmax after the last Dense layer.
The Softmax may output INF, since some of the integer logits are quite large.
We tried dividing the output of the Dense layer by e.g. 255, but it did not help.
Is there a solution or a replacement for Softmax in QKeras for INT8 training?
(So we could also replace the current Softmax at inference time with INT8 weights.)
Hm, this is a bit unusual. I am not exactly sure what is causing your issue, but in general you can also skip the softmax at inference time if you are only using it to pick the best class.
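A hedged sketch of that workaround: softmax is monotonic, so taking the argmax over the raw integer logits of the last Dense layer picks the same class as the argmax over the softmax outputs, without computing any exponentials that could overflow. The logit values below are made up.

```python
# Skip softmax entirely when only the predicted class is needed.
import numpy as np

logits = np.array([1200, -340, 987], dtype=np.int64)  # illustrative integer logits
predicted_class = int(np.argmax(logits))              # same class softmax would pick
```

If you do need the actual probabilities, the usual numerically stable trick is to subtract the maximum logit before exponentiating, which avoids the overflow.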
Regarding large integers: one thing you could do (if you are not doing it already) is to saturate the intermediate activations, e.g. use a ReLU with saturation.
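A sketch (not a confirmed fix) of a saturating activation in QKeras: quantized_relu clips its output to the representable range, so the activations feeding the final Dense layer stay bounded. The layer sizes and bit widths here are illustrative assumptions.

```python
# Saturating intermediate activations with QKeras' quantized_relu.
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from qkeras import QDense, QActivation, quantized_bits, quantized_relu

inputs = Input(shape=(784,))
x = QDense(64,
           kernel_quantizer=quantized_bits(8, 0, alpha="auto_po2"),
           bias_quantizer=quantized_bits(8, 0))(inputs)
x = QActivation(quantized_relu(8, 4))(x)  # output saturates at just under 2**4
outputs = QDense(10,
                 kernel_quantizer=quantized_bits(8, 0, alpha="auto_po2"),
                 bias_quantizer=quantized_bits(8, 0))(x)
model = Model(inputs, outputs)
```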
There is a related issue, #111.
But I'm still confused about how to get the scale value after full-integer quantization. Does anyone know how to get it?