Closed: Knight-X closed this issue 6 years ago
In the conv snippet:

```cpp
ctx.add(new RamTensor(), "out_conv_eightbit_quantized_conv:0", 2);
ctx.add(new RamTensor(), "out_conv_eightbit_quantized_conv:1", 2);  // output min
ctx.add(new RamTensor(), "out_conv_eightbit_quantized_conv:2", 2);  // output max
ctx.push(new QntConvOp<uint8_t, uint8_t, float>({ 1, 2, 2, 1 }, VALID),
         { "x_quint8_const:0", "w_filter_quint8_const:0", "x_min:0", "x_max:0", "w_filter_min:0", "w_filter_max:0" },
         { "out_conv_eightbit_quantized_conv:0", "out_conv_eightbit_quantized_conv:1", "out_conv_eightbit_quantized_conv:2" });
```

In the matmul snippet:

```cpp
ctx.add(new RamTensor(), "z_eightbit_quantized_mat_mul:0", 2);
ctx.add(new RamTensor({1}), "z_eightbit_quantized_mat_mul:1", 2);
ctx.add(new RamTensor({1}), "z_eightbit_quantized_mat_mul:2", 2);
ctx.push(new QntMatMulOp<uint8_t, uint8_t, int>(),
         { "w_quint8_const:0", "w_min:0", "w_max:0", "x_quint8_const:0", "x_min:0", "x_max:0" },
         { "z_eightbit_quantized_mat_mul:0", "z_eightbit_quantized_mat_mul:1", "z_eightbit_quantized_mat_mul:2" });
```

The dims of the output min and output max of Quantized MatMul are set by codegen (`{1}`), and the same should hold for Quantized Conv2d; however, in the conv snippet the second and third outputs are created without dims.
This should be fixed on the develop branch. I'll close this issue when we merge develop into master. See https://github.com/uTensor/utensor_cgen/blob/develop/utensor_cgen/snippets/templates/snippets/conv2d_op.cpp