Open XapaJIaMnu opened 3 years ago
For such use cases, we typically have the matmul output raw `int32` accumulators, then we do a pass outside of the matmul library converting those to `float`.
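The conversion pass mentioned above could look roughly like the sketch below. It is not gemmlowp or ruy code: the helper name `DequantizeAccumulators`, the per-tensor scales `lhs_scale` and `rhs_scale`, and the optional per-column `bias` are assumptions made for illustration. It also assumes that any zero points were already accounted for inside the matmul, so each accumulator only needs to be multiplied by the product of the two scales.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of gemmlowp or ruy): converts the raw int32
// accumulators of a row-major rows x cols result into floats.
// Assumes per-tensor scales and that zero points were already handled by the
// matmul, so real_value ~= lhs_scale * rhs_scale * accumulator.
std::vector<float> DequantizeAccumulators(const std::vector<std::int32_t>& acc,
                                          std::size_t rows, std::size_t cols,
                                          float lhs_scale, float rhs_scale,
                                          const std::vector<float>& bias) {
  const float scale = lhs_scale * rhs_scale;
  std::vector<float> out(rows * cols);
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      const std::size_t i = r * cols + c;
      out[i] = scale * static_cast<float>(acc[i]);
      // Optional per-column bias; pass an empty vector to skip it.
      if (!bias.empty()) out[i] += bias[c];
    }
  }
  return out;
}
```

Keeping this step outside the matmul library means the scales and the bias can be chosen freely, at the cost of one extra pass over the output.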
In gemmlowp, you get raw `int32` accumulators simply by passing an empty `output_pipeline`, as in this part of the test:
https://github.com/google/gemmlowp/blob/fda83bdc38b118cc6b56753bd540caa49e570745/test/test.cc#L1211-L1230
May I suggest taking a look at the ruy library instead of gemmlowp? It's basically gemmlowp's successor, and it's what TFLite has been using by default on ARM for 18 months now. It supports both float and quantized, any combination of `int8` and `uint8`, with or without a zero point, and more quantization flavor variations. I've added an example for getting raw `int32` accumulators:
https://github.com/google/ruy/blob/878283640de7946a43053e8ebf4f15114fbc9156/example/example.cc#L129-L152
@bjacob thank you, that will do nicely. I think I'll use RUY.
Looking at the test, as far as I can see, only `i8_i8_i32_i32` is supported, not `i8_i8_i32_f32`, so I'd have to do the float conversion outside of the multiply, correct?
Yes, exactly.
Hey,
I'm looking to perform `int8 * int8 -> fp32`, where at the output stage I dequantise the `int32_t` result into `float` (and then potentially add a bias). I was following the example from https://github.com/google/gemmlowp/blob/master/doc/quantization_example.cc#L305 but it seems that in order to unquantise to `float`, you compute the quantisation parameters from the fp32 result that you had already computed before, which in practice I wouldn't know. I can compute it with a compensation factor, but it becomes incredibly complicated and computationally (and memory) expensive. Any alternatives?
If I am able to assume quantisation into `int8` as opposed to `uint8` as in the example, I would be able to have quantisation without the `zero_point` parameter (assuming a zero-centred distribution), which would massively simplify dequantisation. Do you support this? Do you have any examples in the codebase where something like this is done?
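To make the zero-point-free case concrete, here is a minimal sketch of symmetric `int8` quantisation, assuming a per-tensor scale chosen as `max|x| / 127`. The names `QuantizedTensor`, `QuantizeSymmetric` and `DotDequantized` are hypothetical and not taken from gemmlowp or ruy. With no zero point, an `int8` dot product accumulated in `int32` is dequantised by a single multiplication with the product of the two scales.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for illustration: int8 values plus one scale.
struct QuantizedTensor {
  std::vector<std::int8_t> values;
  float scale;  // real_value ~= scale * quantized_value, zero point is 0
};

// Symmetric per-tensor quantisation: maps [-max_abs, max_abs] to [-127, 127].
QuantizedTensor QuantizeSymmetric(const std::vector<float>& x) {
  float max_abs = 0.0f;
  for (float v : x) max_abs = std::max(max_abs, std::fabs(v));
  QuantizedTensor q;
  // An all-zero input gets scale 1 to avoid dividing by zero.
  q.scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
  q.values.reserve(x.size());
  for (float v : x) {
    const float scaled = std::round(v / q.scale);
    q.values.push_back(static_cast<std::int8_t>(
        std::min(127.0f, std::max(-127.0f, scaled))));
  }
  return q;
}

// int8 * int8 dot product with int32 accumulation, then dequantised to float.
// Assumes a and b hold the same number of values.
float DotDequantized(const QuantizedTensor& a, const QuantizedTensor& b) {
  std::int32_t acc = 0;
  for (std::size_t i = 0; i < a.values.size(); ++i) {
    acc += static_cast<std::int32_t>(a.values[i]) *
           static_cast<std::int32_t>(b.values[i]);
  }
  return a.scale * b.scale * static_cast<float>(acc);
}
```

The result only approximates the float dot product, with an error that depends on how well the chosen scales fit the data.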