Our quantization scheme relies on integer kernels for a few key ops. Although torch's mm/conv ops are nominally defined for integer operands, in practice the supported type matrix is implemented only sparsely across backends. We therefore define IREE-specific ops for these and use them instead.
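As a hypothetical illustration of the gap (this is not the IREE op itself, just a sketch assuming stock torch on CPU): an integer matmul can be emulated by widening int8 operands to int32 before the multiply, so accumulation happens in a wider type even on backends that lack a native low-precision mm kernel.

```python
import torch

# Sketch only: widen int8 operands to int32 so torch.matmul dispatches to a
# kernel that exists, with int32 accumulation. A dedicated integer kernel
# would avoid materializing the widened operands.
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.int8)
b = torch.tensor([[5, 6], [7, 8]], dtype=torch.int8)
result = torch.matmul(a.to(torch.int32), b.to(torch.int32))
# result is int32: [[19, 22], [43, 50]]
```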
Unlike mm/conv, torch has no dedicated pooling-sum operator; the closest substitute is avg_pool with a divisor override, which is only defined for floating-point inputs. We therefore provide a pooling-sum op as well.
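The workaround in stock torch looks roughly like the following sketch (assumptions: `avg_pool2d` with `divisor_override=1` to turn the average into a sum, and a round trip through float for integer inputs, which is precisely the gap the custom op fills):

```python
import torch
import torch.nn.functional as F

# Sketch only: a pooling sum emulated via avg_pool2d's divisor_override.
# Because torch defines this path only for floating point, an integer input
# must be cast to float and back.
x = torch.arange(16, dtype=torch.int32).reshape(1, 1, 4, 4)
pooled = F.avg_pool2d(x.to(torch.float32), kernel_size=2, divisor_override=1)
pooled_sum = pooled.to(torch.int32)
# 2x2 window sums over the 4x4 input: [[10, 18], [42, 50]]
```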