Randomness for policy in evaluation

laboroai / border

A reinforcement learning library in Rust

Apache License 2.0

Open taku-y opened 3 years ago

taku-y commented 3 years ago

Following the DQN Atari paper, an epsilon-greedy policy with ε = 0.05 should be used during evaluation (see "Evaluation procedure" in the METHODS section).
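To make the request concrete, here is a minimal sketch of epsilon-greedy action selection at evaluation time. It is not border's API: `XorShift64`, `greedy_action`, `epsilon_greedy_action`, and `EVAL_EPSILON` are hypothetical names used only for illustration, and the tiny xorshift generator is an assumption that keeps the example free of external crates (a real implementation would more likely use the `rand` crate).

```rust
/// Tiny xorshift64 generator so the sketch needs no external crates.
/// Hypothetical helper, not part of border.
pub struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The state must be non-zero, otherwise xorshift stays at zero forever.
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        Self { state }
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }

    /// Uniform float in [0, 1), built from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Evaluation epsilon taken from the issue text (assumed constant name).
pub const EVAL_EPSILON: f64 = 0.05;

/// Index of the largest Q-value; the first index wins on ties.
pub fn greedy_action(q_values: &[f64]) -> usize {
    let mut best = 0;
    for (i, &q) in q_values.iter().enumerate() {
        if q > q_values[best] {
            best = i;
        }
    }
    best
}

/// With probability `epsilon` pick a uniformly random action,
/// otherwise pick the greedy one.
pub fn epsilon_greedy_action(
    q_values: &[f64],
    epsilon: f64,
    rng: &mut XorShift64,
) -> usize {
    assert!(!q_values.is_empty(), "need at least one action");
    if rng.next_f64() < epsilon {
        (rng.next_u64() % q_values.len() as u64) as usize
    } else {
        greedy_action(q_values)
    }
}
```

During evaluation, the agent would call something like `epsilon_greedy_action(&q, EVAL_EPSILON, &mut rng)` at each step instead of always taking the argmax, so roughly 5% of evaluation actions are sampled uniformly at random.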