How to sample 64 outputs from the old policy model?

deepseek-ai / DeepSeek-Math

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

MIT License

Open · mohhao opened this issue 7 months ago

mohhao commented 7 months ago

Is it just adjusting the decoding parameters?

Wangpeiyi9979 commented 7 months ago

Yes, just set the temperature to 1.
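To make the reply above concrete, here is a minimal sketch of drawing 64 outputs at temperature 1. It is not the DeepSeekMath training code: `toy_next_token_logits` is a hypothetical stand-in for the old policy model, and the 4-token vocabulary, `EOS` id, `max_len`, and `sample_group` are assumptions made only for illustration. The point it shows is that at temperature 1 the logits are left unscaled, so each output is sampled directly from the policy's own distribution.

```python
import math
import random

VOCAB_SIZE = 4  # assumption: tiny toy vocabulary
EOS = 3         # assumption: token id that ends an output


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def toy_next_token_logits(prefix):
    """Hypothetical stand-in for the old policy model's next-token logits."""
    last = prefix[-1] if prefix else 0
    preferred = (last + 1) % VOCAB_SIZE
    return [1.0 if tok == preferred else 0.0 for tok in range(VOCAB_SIZE)]


def sample_output(rng, max_len=5, temperature=1.0):
    """Sample one output token by token until EOS or max_len is reached."""
    tokens = []
    for _ in range(max_len):
        probs = softmax_with_temperature(toy_next_token_logits(tokens), temperature)
        tok = rng.choices(range(VOCAB_SIZE), weights=probs, k=1)[0]
        tokens.append(tok)
        if tok == EOS:
            break
    return tokens


def sample_group(num_samples=64, temperature=1.0, seed=0):
    """Draw num_samples independent outputs for one question from the toy policy."""
    rng = random.Random(seed)
    return [sample_output(rng, temperature=temperature) for _ in range(num_samples)]


outputs = sample_group()
print(len(outputs), "outputs,", len({tuple(o) for o in outputs}), "distinct")
```

With a real model, the same idea would usually be expressed through the decoding parameters of whatever generation library is in use (sampling enabled, temperature set to 1, 64 return sequences per prompt). In this sketch all 64 draws come from a single random generator, which is a choice of the sketch rather than something the thread specifies.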

mohhao commented 7 months ago

> Yes, just set the temperature to 1.

And just change the random seed?