deepseek-ai / DeepSeek-Math

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
MIT License

How to sample 64 outputs from the old policy model? #14

Open mohhao opened 5 months ago

mohhao commented 5 months ago

Is it just adjusting the decoding parameters?

Wangpeiyi9979 commented 5 months ago

Yes, just set the temperature to 1.
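
As an illustration of what "set the temperature to 1" means, here is a minimal, standard-library sketch that draws 64 sampled outputs from a toy distribution. Everything in it is an assumption for illustration: `toy_next_token_logits` is a hypothetical stand-in for the old policy model, the 4-token vocabulary and sequence length are made up, and the thread does not say which inference library is actually used. With Hugging Face `transformers`, the rough equivalent would be calling `generate` with `do_sample=True`, `temperature=1.0`, and `num_return_sequences=64`.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def toy_next_token_logits(prefix):
    """Hypothetical stand-in for the old policy model (fixed 4-token vocabulary)."""
    return [2.0, 1.0, 0.5, 0.0]


def sample_sequence(rng, length=5, temperature=1.0):
    """Sample one output token by token from the tempered distribution."""
    tokens = []
    for _ in range(length):
        probs = softmax_with_temperature(toy_next_token_logits(tokens), temperature)
        tokens.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return tokens


def sample_group(num_samples=64, seed=0, temperature=1.0):
    """Draw a group of outputs for one prompt from a single seeded generator."""
    rng = random.Random(seed)
    return [sample_sequence(rng, temperature=temperature) for _ in range(num_samples)]


outputs = sample_group()
```

At temperature 1 the logits are left unscaled, so each output is drawn from the model's own distribution rather than a sharpened or flattened one. In this sketch all 64 draws come from one seeded generator, which keeps the whole group reproducible.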

mohhao commented 5 months ago

> Yes, just set the temperature to 1.

And just change the random seed?