deepseek-ai / DeepSeek-Math

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
MIT License

How to sample 64 outputs from the old policy model? #14

Open mohhao opened 7 months ago

mohhao commented 7 months ago

Is it just adjusting the decoding parameters?

Wangpeiyi9979 commented 7 months ago

Yes, just set the temperature to 1.
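A minimal sketch of what "set the temperature to 1" means in practice, added to illustrate the reply above. This is not DeepSeekMath's actual sampling code: `toy_logits`, `sample_token`, and `sample_outputs` are hypothetical names, and the per-step logits are fixed instead of coming from a model conditioned on earlier tokens. The sketch draws 64 outputs from softmax(logits / temperature) using a single seeded random generator. With Hugging Face `transformers`, the analogous settings would typically be `do_sample=True`, `temperature=1.0`, and `num_return_sequences=64` on `generate`, but the thread does not say which inference stack was used.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Draw one token index from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_outputs(logits_per_step, num_samples=64, temperature=1.0, seed=0):
    """Sample num_samples token sequences from one seeded generator."""
    rng = random.Random(seed)
    return [
        tuple(sample_token(logits, temperature, rng) for logits in logits_per_step)
        for _ in range(num_samples)
    ]


# Hypothetical stand-in for a policy model: fixed logits over a
# 4-token vocabulary at each of 3 decoding steps.
toy_logits = [
    [2.0, 1.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
]

outputs = sample_outputs(toy_logits)
print(len(outputs), "samples,", len(set(outputs)), "distinct")
```

In this sketch the 64 samples differ because each draw advances the same random generator, so one seed is enough to get varied outputs; fixing the seed only makes the whole batch reproducible.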

mohhao commented 7 months ago

Quoting Wangpeiyi9979: "Yes, just set the temperature to 1."

And do we just change the random seed?