RuBP17 / AlphaDou

A Doudizhu reinforcement learning AI
GNU General Public License v3.0

Training parameter question #2

Closed · EvelynCarter closed this 1 month ago

EvelynCarter commented 1 month ago

Do the training parameters need to be adjusted, or can I get the best model by training with the defaults in arguments? Do I need to adjust the strategy during training, and if so, could you share the adjustments? Looking forward to your reply. Could you also tell me the Python and PyTorch versions you used?

RuBP17 commented 1 month ago

You can train the AI with the default parameters. You may also want to use some training strategies to get better results, such as using a larger learning rate and exploration rate during the early stages of training and then gradually reducing them as training progresses.

Python 3.9.12, PyTorch 2.0.0
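The "larger exploration rate early, smaller later" advice above is standard epsilon-greedy exploration. As a minimal sketch (the function name and shapes are illustrative, not taken from the AlphaDou code):

```python
import random

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon pick a random action (explore),
    otherwise pick the action with the highest Q-value (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Early in training a larger epsilon makes random exploratory moves more likely; lowering it later lets the learned Q-values dominate action selection.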

EvelynCarter commented 1 month ago

I was using a learning rate of 0.001. At around 3.2 billion frames, the win rate dropped sharply when I evaluated the model. What could cause this? Is it because I did not adjust the learning rate in time?

RuBP17 commented 1 month ago

You may need to consider a lower exploration rate, a lower learning rate, and perhaps a larger batch size.

EvelynCarter commented 1 month ago

> You may need to consider a lower exploration rate, a lower learning rate, and perhaps a larger batch size.

Could you share a set of parameters for an 8× RTX 4090 setup?

RuBP17 commented 1 month ago

> Could you share a set of parameters for an 8× RTX 4090 setup?

I gradually reduced the learning rate from 1e-3 to 3e-5, the exploration rate of the cardplay model from 0.1 to 0.01, and the exploration rate of the bid model from 0.3 to 0.03, for your reference.
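One way to realize such a gradual reduction is to interpolate each value geometrically from its start to its end as a function of frames seen; geometric (rather than linear) interpolation suits quantities that span orders of magnitude, like a learning rate going from 1e-3 to 3e-5. The helper below is a hypothetical sketch, not code from this repository, and the total frame count is an assumed placeholder:

```python
def anneal(start, end, frames, total_frames):
    """Geometrically interpolate from start to end over total_frames,
    then hold at end once total_frames is reached."""
    frac = min(frames / total_frames, 1.0)
    return start * (end / start) ** frac

# Hypothetical schedules mirroring the values quoted above.
TOTAL = 5e9  # assumed total training frames; tune for your run
lr       = lambda f: anneal(1e-3, 3e-5, f, TOTAL)  # learning rate
eps_play = lambda f: anneal(0.1, 0.01, f, TOTAL)   # cardplay model exploration
eps_bid  = lambda f: anneal(0.3, 0.03, f, TOTAL)   # bid model exploration
```

In practice the decay could also be done in a few discrete steps (e.g. halving at milestones) rather than continuously; the thread does not specify which form was used.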

EvelynCarter commented 1 month ago

At roughly how many billion frames did you start lowering them, or is there a criterion for when to do so?