SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License
How can I get the same answer any time? #109

Open sunnyregion opened 10 months ago

sunnyregion commented 10 months ago

I run the following command:

./build/bin/main -m ./models/ReluLLaMA-70B-PowerInfer-GGUF/llama-70b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "介绍一 下上海交通大学。"

I get a different answer every time. I would like to get the same answer on every run.

The same thing happens with the prompt in English:

./build/bin/main -m ./models/ReluLLaMA-70B-PowerInfer-GGUF/llama-70b-relu.q4.powerinfer.gguf -n 128 -t 8 -p "Can you tell me about Shanghai Jiao Tong University? "

hodlen commented 9 months ago

To get consistent answers, add --seed 0 --top-k 1 to your command-line arguments. Fixing the seed and keeping only the single most likely token at each step stabilizes the output for short generations; in our tests, a generation length of 32 was reproducible. Longer generations are not guaranteed to match.

For more details, see the documentation for examples/main: docs.
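
One way to check whether these flags give repeatable output on your machine is to run the same command twice and compare what it prints. The sketch below is only an illustration: `same_output` is a hypothetical helper name, not part of PowerInfer, and it assumes the generated text goes to stdout while logs and timings go to stderr (which is discarded here). The commented invocation reuses the model path from the question with -n 32, the length reported as reproducible.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command twice and report whether both runs
# print identical stdout. stderr is discarded on the assumption that it
# only carries logs and timing information.
# Usage: same_output <command> [args...]
same_output() {
    local first second
    first="$("$@" 2>/dev/null)"
    second="$("$@" 2>/dev/null)"
    if [ "$first" = "$second" ]; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

# Example invocation (uncomment to try with a local build and model):
# same_output ./build/bin/main \
#     -m ./models/ReluLLaMA-70B-PowerInfer-GGUF/llama-70b-relu.q4.powerinfer.gguf \
#     -n 32 -t 8 --seed 0 --top-k 1 \
#     -p "Can you tell me about Shanghai Jiao Tong University? "
```

If this prints "different", dropping --seed 0 --top-k 1 from the command and re-running shows whether the flags are making any difference at that length.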