google / gemmlowp

Low-precision matrix multiplication
Apache License 2.0

int8*int8 -> float? #203

Open XapaJIaMnu opened 3 years ago

XapaJIaMnu commented 3 years ago

Hey,

I'm looking to perform int8 * int8 -> fp32, where at the output stage I dequantise the int32_t result into float (and then potentially add a bias). I was following the example from https://github.com/google/gemmlowp/blob/master/doc/quantization_example.cc#L305, but it seems that in order to dequantise to float you compute the quantisation parameters from the fp32 result that you had already computed before, which in practice I wouldn't know. I can compute it with a compensation factor, but that becomes incredibly complicated and computationally (and memory) expensive. Are there any alternatives?

If I can assume quantisation into int8 as opposed to uint8 as in the example, I would be able to quantise without the zero_point parameter (assuming a zero-centred distribution), which would massively simplify dequantisation. Do you support this? Do you have any examples in the codebase where something like this is done?

bjacob commented 3 years ago

For such use cases, we typically have the matmul output raw int32 accumulators, then we do a pass outside of the matmul library converting those to float.

In gemmlowp, you get raw int32 accumulators simply by passing an empty output_pipeline, as in this part of the test: https://github.com/google/gemmlowp/blob/fda83bdc38b118cc6b56753bd540caa49e570745/test/test.cc#L1211-L1230

May I suggest taking a look at the ruy library instead of gemmlowp? It's essentially gemmlowp's successor: it's what TFLite has been using by default on ARM for the last 18 months, and it supports both float and quantized, any combination of int8 and uint8, with or without zero points, and more quantization flavour variations. I've added an example for getting raw int32 accumulators: https://github.com/google/ruy/blob/878283640de7946a43053e8ebf4f15114fbc9156/example/example.cc#L129-L152

XapaJIaMnu commented 3 years ago

@bjacob thank you, that will do nicely. I think I'll use ruy.

Looking at the test, as far as I can see only i8_i8_i32_i32 is supported, not i8_i8_i32_f32, so I'd have to do the float conversion outside of the multiply, correct?

bjacob commented 3 years ago

Yes, exactly.