I tried to reproduce the results of the paper and ran into the following problem.
I tested Llama2-7b 4-16-16 (RTN) with 10_optimize_rotation.sh and got a wikitext-2 ppl of 5.5, which exactly matches the value reported in the paper.
However, when I ran the zero-shot tasks with lm_eval==0.4.3,
the results were as follows.
| | ARC-e | ARC-c | PIQA | HellaSwag | Winogrande |
|---|---|---|---|---|---|
| Paper | 72.2 | 48.6 | 78.2 | 74.2 | 67.9 |
| Mine | 70.6 | 42.7 | 77.7 | 74.6 | 67.3 |
I think there's a discrepancy in the zero-shot results, most notably on ARC-c (42.7 vs. 48.6). The only difference I'm aware of is that the paper's results were obtained on 8 x A100, while I tested on 8 x A6000.
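To make the gap concrete, here is a minimal sketch that computes the per-task difference (mine minus paper) and the five-task averages from the numbers above. The helper names `gaps` and `mean` are my own and are not part of the repo or of lm_eval.

```python
# Zero-shot accuracies (%) copied from the results reported in this issue.
paper = {"ARC-e": 72.2, "ARC-c": 48.6, "PIQA": 78.2, "HellaSwag": 74.2, "Winogrande": 67.9}
mine = {"ARC-e": 70.6, "ARC-c": 42.7, "PIQA": 77.7, "HellaSwag": 74.6, "Winogrande": 67.3}


def gaps(reference, reproduced):
    """Return reproduced - reference per task, rounded to one decimal place."""
    return {task: round(reproduced[task] - reference[task], 1) for task in reference}


def mean(scores):
    """Return the unweighted average over tasks, rounded to two decimal places."""
    return round(sum(scores.values()) / len(scores), 2)


diff = gaps(paper, mine)
# Task with the most negative gap, i.e. where my run falls furthest below the paper.
worst_task = min(diff, key=diff.get)

if __name__ == "__main__":
    for task, delta in diff.items():
        print(f"{task:<11} {delta:+.1f}")
    print(f"average     paper={mean(paper)} mine={mean(mine)}")
    print(f"largest drop: {worst_task}")
```

Four of the five tasks are within about 1.6 points of the paper, so the ARC-c drop of 5.9 points is what stands out.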
If 10_optimize_rotation.sh is not the exact script for reproducing the paper's results, could you share the exact script?
Thanks.
Thanks for sharing great research.