patelprateek closed this issue 11 months ago
If you have a two-dimensional matrix, you can normalize it into the range [-1, 1] by dividing it by the absolute maximum taken over either its rows or its columns. In a matrix multiplication A*B=C, you can do this for both matrices A and B.
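To make the idea above concrete, here is a minimal pure-Python sketch. It is not the paper's or any library's implementation, and all function names are made up for illustration. It assumes A is scaled by the absolute maximum of each row, B by the absolute maximum of each column, and that the normalized values are rounded to 127 integer levels (the int8 range). The integer product is then rescaled by the matching row and column scales.

```python
def absmax_scales(matrix, axis):
    """Absolute maximum of each row (axis=1) or each column (axis=0)."""
    if axis == 1:
        return [max(abs(v) for v in row) for row in matrix]
    return [max(abs(row[j]) for row in matrix) for j in range(len(matrix[0]))]


def quantize(matrix, scales, axis, levels=127):
    """Normalize into [-1, 1] with the given scales, then round to integer levels."""
    quantized = []
    for i, row in enumerate(matrix):
        q_row = []
        for j, v in enumerate(row):
            s = scales[i] if axis == 1 else scales[j]
            # An all-zero row/column has scale 0; keep its entries at 0.
            q_row.append(round(levels * v / s) if s != 0 else 0)
        quantized.append(q_row)
    return quantized


def matmul(a, b):
    """Plain matrix product on nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def quantized_matmul(a, b, levels=127):
    """Approximate a @ b with row-wise scales for a and column-wise scales for b."""
    row_scales = absmax_scales(a, axis=1)  # one scale per row of A (assumption)
    col_scales = absmax_scales(b, axis=0)  # one scale per column of B (assumption)
    a_q = quantize(a, row_scales, axis=1, levels=levels)
    b_q = quantize(b, col_scales, axis=0, levels=levels)
    c_q = matmul(a_q, b_q)  # integer-valued product
    # Undo the normalization: entry (i, j) used row scale i and column scale j.
    return [
        [c_q[i][j] * row_scales[i] * col_scales[j] / (levels * levels)
         for j in range(len(c_q[0]))]
        for i in range(len(c_q))
    ]


A = [[1.0, -2.0], [0.5, 0.25]]
B = [[4.0, 0.1], [-1.0, 0.2]]
C_exact = matmul(A, B)
C_approx = quantized_matmul(A, B)
```

Comparing `C_approx` with `C_exact` shows the rounding error introduced by the quantization. Using a single scale for the whole matrix instead of one per row or column would correspond to taking the absolute maximum over the entire tensor.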
Let me know if it is still unclear and I will try to explain it in different terms.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Hi, I was going through the impressive work you have done here: https://arxiv.org/pdf/2208.07339.pdf
A few naive questions: Table 1 compares different types of quantization. Could you elaborate on the differences, for example row vs. vector quantization? I could not find any pointers on how row-wise quantization differs from vector-wise quantization, or on the terminology. Similarly, for Int8 absmax vs. Int8 absmax row-wise: does it mean that the former takes the absmax over the entire matrix/tensor, whereas row-wise takes the max for each row? What about Int8 absmax vector-wise? Thanks again.