Closed alexeyhorkin closed 2 years ago
Hey! I thought this change might be helpful. It makes it easier to tell when I can stop a long training run, because the score is then on the same scale as the graph and the final_score variable.
Right now, for some reason, we multiply the evaluation result by 5:
final_score = evaluate(
    make_env(clip_rewards=False, seed=9),
    agent, n_games=30, greedy=True, t_max=10 * 1000
) * 5
assert final_score >= 15, "not as cool as DQN can"
Suggestion:
final_score = evaluate(
    make_env(clip_rewards=False, seed=9),
    agent, n_games=30, greedy=True, t_max=10 * 1000
)
assert final_score >= 3, "not as cool as DQN can"
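As a sanity check, here is a minimal sketch suggesting that the pass threshold itself does not change, since 15 / 5 = 3. The helper names old_check and new_check are hypothetical, and mean_reward is an assumed stand-in for whatever evaluate returns:

```python
def old_check(mean_reward: float) -> bool:
    # Current notebook: the result is scaled by 5 and compared with 15.
    final_score = mean_reward * 5
    return final_score >= 15


def new_check(mean_reward: float) -> bool:
    # Suggested version: no scaling, threshold divided by 5.
    final_score = mean_reward
    return final_score >= 3


# Hypothetical mean rewards around the threshold.
samples = [0.0, 2.9, 3.0, 3.1, 10.0]
agreement = [old_check(r) == new_check(r) for r in samples]
print(agreement)
```

Only the scale of final_score differs, so it can be compared directly with the values on the training graph.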
What do you think?
Also, I ran into this problem myself and added this fix. I tested it on Colab and on my laptop, and it worked and helped me.
@dniku ^^
Thanks!