Per channel quant

This is a preliminary version, posted to get some feedback. The goal is to add GEMM support for kernel quantization parameters that differ per output channel.

Changes:

A modified version of the 4x8 GEMM ukernel was added to support a kernel scale and zero point per output channel.
Helper functions were added: weight packing and computation of requantization parameters.
A GEMM micro-kernel test function was added, with corresponding unit tests.
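
For reviewers who want the arithmetic spelled out, here is a minimal reference sketch of a GEMM with per-output-channel kernel quantization. It is not the code in this PR and not the QNNPACK API: the function names, argument layout, float multipliers, and the use of Python's `round` (which rounds ties to even) are all assumptions made for illustration.

```python
def requantization_scales(input_scale, kernel_scales, output_scale):
    """Hypothetical helper: one float multiplier per output channel."""
    return [input_scale * ks / output_scale for ks in kernel_scales]


def gemm_per_channel(a, w, bias, input_zero_point, kernel_zero_points,
                     multipliers, output_zero_point, qmin=0, qmax=255):
    """Reference quantized GEMM (illustrative, not the ukernel).

    a: M x K quantized input rows.
    w: N x K quantized weights, one row per output channel.
    kernel_zero_points and multipliers each hold N values, one per channel.
    """
    out = []
    for row in a:
        out_row = []
        for n, w_row in enumerate(w):
            # Integer accumulation using this channel's own zero point.
            acc = bias[n]
            for x, k in zip(row, w_row):
                acc += (x - input_zero_point) * (k - kernel_zero_points[n])
            # Requantize with this channel's own multiplier, then clamp.
            q = round(acc * multipliers[n]) + output_zero_point
            out_row.append(max(qmin, min(qmax, q)))
        out.append(out_row)
    return out


multipliers = requantization_scales(0.5, [0.5, 0.25], 0.25)
result = gemm_per_channel([[130, 126]], [[12, 8], [3, 7]], [0, 0],
                          128, [10, 5], multipliers, 128)
print(result)
```

The only difference from per-tensor quantization in this sketch is that the kernel zero point and the requantization multiplier are indexed by the output channel `n` instead of being single scalars.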

pytorch / QNNPACK

Per channel quant #51