The `LowerConvsToMatMul` transformation previously only worked with `Conv` nodes that have a static initializer as the weight input. This PR extends it to handle the case where the weights are fed by a `Quant` node, including any transpose/reshape required for the scale factors. See the example below (before/after, from a 4-bit MobileNet-v1).
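To illustrate why the scale factors need the same transpose/reshape as the weights, here is a minimal NumPy sketch. It is not the code from this PR: `quantize`, `lower_weights` and `lower_scale` are hypothetical helpers, the quantizer is a simplified signed uniform one, and the scale is assumed to be either per-tensor or per-output-channel with shape `(OFM, 1, 1, 1)`. The weight layout `(OFM, IFM, kH, kW) -> (IFM*kH*kW, OFM)` is also an assumption chosen for illustration.

```python
import numpy as np

def quantize(w, scale, bitwidth=4):
    """Simplified signed uniform quantizer (stand-in for a Quant node)."""
    qmin, qmax = -(2 ** (bitwidth - 1)), 2 ** (bitwidth - 1) - 1
    return np.clip(np.round(w / scale), qmin, qmax) * scale

def lower_weights(w):
    """Assumed lowering: (OFM, IFM, kH, kW) -> (IFM*kH*kW, OFM)."""
    return w.reshape(w.shape[0], -1).T

def lower_scale(scale):
    """Apply the matching reshape/transpose to the scale factors.

    Assumes a per-tensor scale (single element) or a per-output-channel
    scale whose leading axis is OFM, e.g. shape (OFM, 1, 1, 1).
    """
    scale = np.asarray(scale)
    if scale.size == 1:
        return scale  # per-tensor scale broadcasts as-is
    return scale.reshape(scale.shape[0], -1).T

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))              # OFM=8, IFM=3, 3x3 kernel
scale = rng.uniform(0.05, 0.2, size=(8, 1, 1, 1))  # per-output-channel scale

# Quantize in conv layout, then lower the result ...
before = lower_weights(quantize(w, scale))
# ... versus lower weights and scale first, then quantize in matmul layout.
after = quantize(lower_weights(w), lower_scale(scale))

print(lower_scale(scale).shape)  # (1, 8)
print(np.array_equal(before, after))
```

Without the scale transformation, a `(8, 1, 1, 1)` scale would no longer broadcast correctly against the `(27, 8)` matmul weights, which is why the scale has to move along with the weight layout.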