Closed by dacorvo 2 months ago
What does this PR do?
This ensures that QTensor linear operations using optimized kernels give the same results as those using dequantized weights.
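As a rough illustration of this kind of equivalence check, here is a minimal pure-Python sketch. It is not the test code from this PR and does not use the QTensor API: the helpers `quantize_rows`, `linear_dequantized`, `linear_fused` and `outputs_match` are hypothetical names, the quantization scheme (symmetric, per-row, 8-bit) is an assumption, and `linear_fused` only stands in for an optimized kernel.

```python
import math
import random


def quantize_rows(weight, bits=8):
    """Symmetric per-row quantization (assumed scheme): integer rows plus one scale per row."""
    qmax = 2 ** (bits - 1) - 1
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        q_rows.append([round(v / scale) for v in row])
        scales.append(scale)
    return q_rows, scales


def linear_dequantized(x, q_rows, scales):
    """Reference path: rebuild float weights first, then compute x @ w.T."""
    w = [[q * s for q in row] for row, s in zip(q_rows, scales)]
    return [[sum(xi * wi for xi, wi in zip(x_row, w_row)) for w_row in w]
            for x_row in x]


def linear_fused(x, q_rows, scales):
    """Stand-in for an optimized kernel: accumulate against the integer
    weights and apply each row's scale once per output value."""
    return [[s * sum(xi * qi for xi, qi in zip(x_row, q_row))
             for q_row, s in zip(q_rows, scales)]
            for x_row in x]


def outputs_match(a, b, rel_tol=1e-9, abs_tol=1e-9):
    """Element-wise comparison within a tolerance, since the two paths
    order their floating-point operations differently."""
    return all(math.isclose(u, v, rel_tol=rel_tol, abs_tol=abs_tol)
               for row_a, row_b in zip(a, b)
               for u, v in zip(row_a, row_b))


rng = random.Random(0)
weight = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(4)]  # (out, in)
x = [[rng.uniform(-1, 1) for _ in range(16)] for _ in range(3)]       # (batch, in)

q_rows, scales = quantize_rows(weight)
reference = linear_dequantized(x, q_rows, scales)
result = linear_fused(x, q_rows, scales)
assert outputs_match(reference, result)
```

The comparison uses a tolerance rather than exact equality because the two paths apply the scale at different points, so their floating-point rounding can differ slightly even when both are correct.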