How to reproduce the benchmark - Githubissues

PaddlePaddle / PARL

A high-performance distributed training framework for Reinforcement Learning

https://parl.readthedocs.io/

Apache License 2.0

3.25k stars 820 forks

Closed Littlehong-ai closed 1 year ago

Littlehong-ai commented 1 year ago

Could anyone tell me how to reproduce the model results? Is it because the code does not set random seeds or something similar?

Littlehong-ai commented 1 year ago

For example, every time I run train.py under benchmark/fluid/DQN, I get a different test_reward.

TomorrowIsAnOtherDay commented 1 year ago

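This is expected: RL needs random exploration during training, so results carry some randomness. If you want to reliably reproduce a specific result, you can set the random seeds for numpy, paddle/torch, random, and gym before the program runs.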
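
A minimal sketch of what seeding those four sources could look like. This is not code from PARL or from the benchmark's train.py. The helper names `set_global_seed` and `seed_env` are hypothetical, `paddle.seed` assumes the Paddle 2.x API (the older fluid API is seeded differently), and the gym part assumes either the newer `reset(seed=...)` signature or the older `env.seed(...)` method.

```python
import random


def set_global_seed(seed):
    """Seed each RNG library that is installed; return the names that were seeded."""
    seeded = ["random"]
    random.seed(seed)  # Python's built-in RNG

    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's legacy global RNG
        seeded.append("numpy")
    except ImportError:
        pass

    try:
        import torch
        torch.manual_seed(seed)
        seeded.append("torch")
    except ImportError:
        pass

    try:
        import paddle
        paddle.seed(seed)  # assumption: Paddle 2.x
        seeded.append("paddle")
    except (ImportError, AttributeError):
        pass

    return seeded


def seed_env(env, seed):
    """Seed a gym-style environment and reset it (hypothetical helper)."""
    # Action sampling (e.g. env.action_space.sample()) has its own RNG.
    env.action_space.seed(seed)
    try:
        return env.reset(seed=seed)  # newer gym signature
    except TypeError:
        env.seed(seed)  # older gym versions
        return env.reset()
```

Calling `set_global_seed(0)` at the top of the training script, and `seed_env(env, 0)` right after creating each environment, should remove most run-to-run variation. Some GPU operations may still be nondeterministic, so results can differ slightly even with fixed seeds.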

Littlehong-ai commented 1 year ago

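This is expected: RL needs random exploration during training, so results carry some randomness. If you want to reliably reproduce a specific result, you can set the random seeds for numpy, paddle/torch, random, and gym before the program runs.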