Closed · fantasysee closed this issue 4 months ago
Yes, you can skip the first three steps in the README's Usage section for the reproduction.
For weight-only quantization, we use only --lwc for LLaMA, but both --lwc and --let for OPT. So, to reproduce the results with our OPT checkpoint, you should activate both --lwc and --let. For example:
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /PATH/TO/OPT-1.3b \
--epochs 0 --output_dir ./log/test \
--eval_ppl --wbits 4 --abits 16 --group_size 128 --lwc --let \
--resume /PATH/TO/opt-1.3b-w4a16g128.pth
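To make the role of the two flags concrete, here is a minimal PyTorch sketch of the ideas they correspond to in the paper: --lwc learns sigmoid-parameterized clipping bounds for the weights, and --let learns a channel-wise shift/scale that is folded exactly into the following linear layer. This is an illustrative simplification (per-tensor rather than per-group quantization); the function names and shapes are assumptions, not the repository's actual code.

```python
import torch

def lwc_fake_quant(w: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor,
                   n_bits: int = 4) -> torch.Tensor:
    """Learnable weight clipping (--lwc), sketched per-tensor.

    gamma/beta are learnable scalars; the sigmoid keeps the clipping
    factors inside (0, 1). The real code applies this per group
    (e.g. --group_size 128), not per tensor as shown here.
    """
    qmax = 2 ** n_bits - 1
    upper = torch.sigmoid(gamma) * w.max()   # learned upper clip bound
    lower = torch.sigmoid(beta) * w.min()    # learned lower clip bound
    step = (upper - lower) / qmax            # quantization step size
    zero_point = torch.round(-lower / step)
    w_int = torch.clamp(torch.round(w / step) + zero_point, 0, qmax)
    return (w_int - zero_point) * step       # dequantized ("fake-quant") weights

def let_transform(x, w, bias, scale, shift):
    """Learnable equivalent transformation (--let), sketched.

    Shifts/scales the activations channel-wise and folds the inverse
    into the weights and bias, so the layer output is unchanged while
    the activations become easier to quantize.
    """
    x_t = (x - shift) / scale        # transformed activations
    w_t = scale.unsqueeze(1) * w     # quantization difficulty moved into W
    b_t = bias + shift @ w           # bias absorbs the shift exactly
    return x_t @ w_t + b_t

# Sanity check: the equivalent transformation preserves the layer output.
x, w, b = torch.randn(2, 8), torch.randn(8, 4), torch.randn(4)
scale, shift = torch.rand(8) + 0.5, torch.randn(8)
assert torch.allclose(let_transform(x, w, b, scale, shift), x @ w + b, atol=1e-5)
```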
I just ran it, and here is the result:
Thank you for your prompt assistance! I have successfully reproduced the OPT model results on my end. Appreciate your help!
Best regards, Chao
Dear authors,
Thank you for sharing your remarkable work.
I am currently focusing on replicating the evaluation results mentioned in your paper as part of our research efforts. While I've successfully matched results for llama-7b and llama-2-7b, I've encountered discrepancies in the OPT series results. Please see my attached pictures.
I'm using the OPT-1.3b model from https://huggingface.co/facebook/opt-1.3b with the following command line:
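(The command itself is not preserved in the thread; a plausible reconstruction, assuming the README's weight-only evaluation settings for OPT with only --lwc enabled, would be:)

```bash
# Hypothetical reconstruction -- not the exact command from the original report.
CUDA_VISIBLE_DEVICES=0 python main.py \
--model /PATH/TO/OPT-1.3b \
--epochs 0 --output_dir ./log/test \
--eval_ppl --wbits 4 --abits 16 --group_size 128 --lwc \
--resume /PATH/TO/opt-1.3b-w4a16g128.pth
```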
Could you confirm whether the base OPT models match the ones used for your pre-trained checkpoints? If they differ, please specify which ones you used.
Also, to replicate only the paper's results, I can skip the first three steps in the README's Usage section, right?
Thank you for your time and assistance.
Best regards, Chao