Reproduce Table 1 by this repo?

hao-ai-lab / Consistency_LLM

[ICML 2024] CLLMs: Consistency Large Language Models

http://arxiv.org/abs/2403.00835

Apache License 2.0

358 stars, 18 forks

Reproduce Table 1 by this repo? #6

Closed by dreaming-panda 6 months ago

dreaming-panda commented 6 months ago

Hello, thanks for your nice work.

I want to reproduce Table 1 (i.e., the accuracy on GSM8K and ShareGPT), but I cannot find scripts to do this.

Could you point me to a way to do it?

Thank you!

snyhlxde1 commented 6 months ago

Thanks for your interest in our work! For Table 1, we follow the same settings as in human-eval, Spider, MT-bench and GSM8K to evaluate CLLMs' generation quality, but with Jacobi decoding instead of conventional autoregressive (AR) decoding.

The output generation code for GSM8K and ShareGPT (using Jacobi decoding) has just been uploaded under the eval/gsm8k and eval/mt-bench directories. You can use these scripts to generate outputs and follow MT-bench's instructions to complete the evaluation.

dreaming-panda commented 6 months ago

Thank you for your patient response!

agentup commented 6 months ago

Thanks for your good work.

I tried to run the gsm8k scripts to reproduce the Table 1 results. However, I got the final results shown in the figure.

The performance of CLLM with Jacobi is much lower.

Do you have any idea where I made a mistake? Thanks a lot.

snyhlxde1 commented 6 months ago

@agentup Could you provide some more information about your settings? What hardware are you running on, and what command arguments did you use?

snyhlxde1 commented 6 months ago

I noticed you are using max_new_tokens=512. This could be why you are not getting a speedup: you are iterating over an n-token sequence of size 512, which can introduce a lot of compute overhead. For GSM8K, please change max_new_tokens to 16 or 32 for a good speedup.

Note that in this repo, the command arguments have the following meanings: max_new_tokens is the n-token sequence size for Jacobi trajectory generation, and max_seq_len is your total model generation length.
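To illustrate how the two arguments relate, here is a minimal toy sketch of block-wise Jacobi decoding. It is not the repo's implementation: `toy_next_token` is a hypothetical stand-in for a greedy model step, the function names are made up for illustration, and `max_seq_len` is assumed here to count generated tokens only, which may differ from how the repo counts it. Each pass of the inner loop corresponds to one parallel forward pass over an n-token block, so a block size of 512 makes every pass much heavier than a block size of 16 or 32.

```python
from typing import Callable, List, Tuple


def toy_next_token(context: List[int]) -> int:
    """Hypothetical stand-in for a greedy LLM step (not a real model)."""
    return (sum(context) * 31 + len(context)) % 50


def jacobi_decode_block(
    prefix: List[int], n: int, next_token: Callable[[List[int]], int], pad: int = 0
) -> Tuple[List[int], int]:
    """Iterate an n-token guess until it stops changing (a fixed point)."""
    guess = [pad] * n
    iterations = 0
    while True:
        iterations += 1
        # Every position is recomputed from the previous guess; a real model
        # would do this in a single parallel forward pass over the block.
        new = [next_token(prefix + guess[:i]) for i in range(n)]
        if new == guess:
            return guess, iterations
        guess = new


def generate(
    prompt: List[int],
    max_new_tokens: int,  # n-token block size per Jacobi trajectory
    max_seq_len: int,     # assumed: total number of tokens to generate
    next_token: Callable[[List[int]], int],
) -> Tuple[List[int], int]:
    """Generate max_seq_len tokens, one Jacobi block at a time."""
    tokens = list(prompt)
    generated = 0
    total_iterations = 0
    while generated < max_seq_len:
        n = min(max_new_tokens, max_seq_len - generated)
        block, iterations = jacobi_decode_block(tokens, n, next_token)
        tokens.extend(block)
        generated += n
        total_iterations += iterations
    return tokens[len(prompt):], total_iterations


def autoregressive(
    prompt: List[int], num_tokens: int, next_token: Callable[[List[int]], int]
) -> List[int]:
    """Conventional one-token-at-a-time greedy decoding, for comparison."""
    tokens = list(prompt)
    for _ in range(num_tokens):
        tokens.append(next_token(tokens))
    return tokens[len(prompt):]


if __name__ == "__main__":
    out, iters = generate([1, 2, 3], max_new_tokens=4, max_seq_len=10,
                          next_token=toy_next_token)
    print(out == autoregressive([1, 2, 3], 10, toy_next_token), iters)
```

With greedy decoding, the fixed point of each block matches the autoregressive output, so the two decoders produce the same tokens in this sketch. The toy function gives no speedup by itself; it only shows that `max_new_tokens` sets the block size while `max_seq_len` sets how many blocks are needed in total.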

agentup commented 6 months ago

Thanks for your help. The problem has been solved.