NolanoOrg / cformers

SoTA Transformers with C-backend for fast inference on your CPU.
MIT License

Add C functions for MatMul over Int-3 Quant and Int-4 with different bin-sizes #12

Open Ayushk4 opened 1 year ago

Ayushk4 commented 1 year ago

Both int3 and int4 with bin sizes greater than 32 would need empirical validation; the existing Int-4 LLaMa study is not enough, as it only considered quantization of the weight matrices, not of the intermediate representations.
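For context, "bin size" here means how many consecutive weights share one quantization scale. The sketch below is a minimal, hypothetical illustration of bin-wise int4 quantization of one weight row; the function name, scale layout, and nibble packing are assumptions for illustration, not cformers' actual code.

```c
/* Hypothetical sketch: quantize one weight row to int4 with a shared
 * float scale per bin of `bin_size` consecutive weights.
 * Names and packing layout are illustrative only. */
#include <math.h>
#include <stdint.h>

void quantize_row_q4(const float *w, uint8_t *q, float *scales,
                     int n, int bin_size) {
    for (int b = 0; b < n / bin_size; b++) {
        /* find the max absolute value in this bin */
        float amax = 0.0f;
        for (int i = 0; i < bin_size; i++) {
            float v = fabsf(w[b * bin_size + i]);
            if (v > amax) amax = v;
        }
        float d = amax / 7.0f;                    /* int4 range: -7 .. 7 */
        scales[b] = d;
        float id = d != 0.0f ? 1.0f / d : 0.0f;
        /* pack two 4-bit values per byte, offset by 8 to store unsigned */
        for (int i = 0; i < bin_size; i += 2) {
            int v0 = (int)roundf(w[b * bin_size + i]     * id) + 8;
            int v1 = (int)roundf(w[b * bin_size + i + 1] * id) + 8;
            q[(b * bin_size + i) / 2] = (uint8_t)(v0 | (v1 << 4));
        }
    }
}
```

A larger bin size amortizes the per-bin scale over more weights (fewer overhead bits per weight) but makes the quantization coarser, which is exactly why empirical validation is needed.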

Quality may well worsen when the intermediate representations are also quantized to the same extent (i.e., with the same larger bin size) as the weight matrix. If that happens, intermediate representations will have to be quantized differently from the weights, and new MatMul kernels will have to be added, as sketched below.
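Below is a minimal, hypothetical sketch of the inner dot-product loop for the weight-only-quantization case: int4 weights are dequantized on the fly against float activations. If the activations were quantized too, the accumulation and scaling would change and a different kernel would be required. The function name and layout mirror the sketch above and are assumptions, not the actual cformers kernels.

```c
/* Hypothetical sketch: dot product of one q4 weight row against a float
 * activation vector, dequantizing per bin. Quantized activations would
 * need a separate scale per activation bin and integer accumulation,
 * i.e. a different kernel. */
#include <stdint.h>

float dot_q4_f32(const uint8_t *q, const float *scales,
                 const float *x, int n, int bin_size) {
    float sum = 0.0f;
    for (int b = 0; b < n / bin_size; b++) {
        const float d = scales[b];
        float bsum = 0.0f;
        for (int i = 0; i += 2, i < bin_size; ) { /* see note below */ }
        for (int i = 0; i < bin_size; i += 2) {
            const uint8_t byte = q[(b * bin_size + i) / 2];
            const int v0 = (byte & 0x0F) - 8;     /* low nibble  */
            const int v1 = (byte >> 4)   - 8;     /* high nibble */
            bsum += v0 * x[b * bin_size + i] + v1 * x[b * bin_size + i + 1];
        }
        sum += d * bsum;
    }
    return sum;
}
```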

Originally suggested by @MarkSchmidty in https://github.com/NolanoOrg/cformers/issues/2#issuecomment-1475648776