laboroai / border

A reinforcement learning library in Rust
Apache License 2.0

Randomness for policy in evaluation #23

Open taku-y opened 3 years ago

taku-y commented 3 years ago

Following the DQN Atari paper, an epsilon-greedy policy with ε = 0.05 should be used during evaluation (see "Evaluation procedure" in the METHODS section).
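To make the request concrete, here is a minimal sketch of epsilon-greedy action selection at evaluation time. It is not border's API: `XorShift64`, `greedy_action`, and `eval_action` are hypothetical names made up for illustration, and the small built-in PRNG is only there so the snippet needs no external crates. A real implementation would presumably plug into the agent's existing evaluation mode and use a proper RNG.

```rust
/// Hypothetical tiny xorshift64 PRNG, used only to keep this sketch dependency-free.
pub struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        Self { state }
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }

    /// Uniform sample in [0, 1), built from the top 53 bits.
    pub fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Index of the largest Q-value (first one wins on ties).
pub fn greedy_action(q_values: &[f32]) -> usize {
    let mut best = 0;
    for (i, &q) in q_values.iter().enumerate() {
        if q > q_values[best] {
            best = i;
        }
    }
    best
}

/// Epsilon-greedy selection for evaluation (hypothetical helper).
/// With probability `epsilon` a uniformly random action is returned,
/// otherwise the greedy one. The issue asks for epsilon = 0.05.
pub fn eval_action(q_values: &[f32], epsilon: f64, rng: &mut XorShift64) -> usize {
    assert!(!q_values.is_empty(), "need at least one action");
    if rng.next_f64() < epsilon {
        // Modulo selection is slightly biased in general; fine for a sketch.
        (rng.next_u64() % q_values.len() as u64) as usize
    } else {
        greedy_action(q_values)
    }
}
```

During evaluation the agent would call something like `eval_action(&q, 0.05, &mut rng)` instead of always taking the argmax. Note that the random branch can still return the greedy action, so with `n` actions the observed rate of non-greedy actions is about `0.05 * (n - 1) / n`.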