Closed priscilla-pan closed 7 months ago
Hi,
Thanks for your interest in Atom, and sorry for the confusion in our codebase. I believe the accuracy-related code strictly follows the workflow shown in our paper (see Fig. 6), including dynamic activation quantization.
To be specific, Atom fuses the dynamic quantization operator into the preceding element-wise operator, e.g., the layer norm or the activation function. You can check code snippets such as LayerNorm and Activate.
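To illustrate the fusion described above, here is a minimal sketch (not Atom's actual fused GPU kernel) of a LayerNorm whose output is immediately quantized per token, with the scale computed dynamically from the current activation tensor. All names here are illustrative, not from the Atom codebase:

```python
import numpy as np

def layernorm_then_quant(x, eps=1e-5, n_bits=8):
    """Hypothetical sketch: LayerNorm fused with per-token dynamic
    quantization. Atom's real version is a fused kernel; this only
    shows the data flow."""
    # LayerNorm over the last dimension (weight/bias omitted for brevity)
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    y = (x - mu) / np.sqrt(var + eps)
    # Dynamic quantization: the scale is derived from this activation
    # at run time, per token (row), instead of being fixed offline
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(y).max(axis=-1, keepdims=True) / qmax
    q = np.clip(np.round(y / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale  # dequantize later as q * scale

x = np.random.randn(4, 16).astype(np.float32)
q, s = layernorm_then_quant(x)
```

Because the scales come from the tensor being quantized, no calibration pass is needed, which is why the step can live inside the preceding element-wise op.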
Hope this resolves your question.
That is just the default constructor of our wrapper class Quantizer. We configure all the quantization functions (in fact, replacing the lambda x: x placeholder) here. Please check this.
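The pattern being described can be sketched as follows. This is a simplified stand-in, not Atom's actual Quantizer class: the constructor installs an identity lambda x: x, and a later configuration step swaps in a real quantization function, so an unconfigured wrapper is a no-op rather than missing quantization:

```python
class Quantizer:
    """Hypothetical sketch of a wrapper whose quantization function
    defaults to identity and is replaced at configuration time."""
    def __init__(self):
        # Default: pass-through, so an unconfigured Quantizer does nothing
        self.quant_func = lambda x: x

    def configure(self, func):
        # Replace the identity placeholder with a real quantization function
        self.quant_func = func

    def __call__(self, x):
        return self.quant_func(x)

def fake_int8_quant(x):
    # Toy symmetric quantize-dequantize, for illustration only
    scale = max(abs(v) for v in x) / 127
    return [round(v / scale) * scale for v in x]

q = Quantizer()
identity_out = q([1.0, -2.0])   # identity before configuration
q.configure(fake_int8_quant)
quant_out = q([1.0, -2.0])      # quantize-dequantize after configuration
```

So reading only the constructor makes it look like quantization is skipped; the actual functions are wired in by the configuration step.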
I have reproduced the results for LLaMA-7B; the WikiText2 perplexity matches Table 3 (6.16). But in the code you provide, dynamic quantization for activations is not included. As far as I know, dynamic quantization also causes quantization error. Why is it omitted from your code?