Closed: adrigrillo closed this 5 years ago
Changes in the DQN agents:
Implementation of double and dueling DQN without prioritized memory. The policies now are:
The exploration rate can now be modified easily: the `epsilon_calculator` object is an argument of the policy learner.
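To illustrate what such a calculator might look like, here is a minimal sketch of the two schedule types used below. The `LinearSchedule(schedule_timesteps, initial_p, final_p)` signature matches the instantiation example; the exponential decay formula is an assumption, not necessarily the repository's exact implementation.

```python
import math

class LinearSchedule:
    """Linearly interpolates epsilon from initial_p to final_p
    over schedule_timesteps steps, then stays at final_p."""

    def __init__(self, schedule_timesteps, initial_p=1.0, final_p=0.02):
        self.schedule_timesteps = schedule_timesteps
        self.initial_p = initial_p
        self.final_p = final_p

    def value(self, t):
        # Fraction of the schedule elapsed, capped at 1.0.
        fraction = min(float(t) / self.schedule_timesteps, 1.0)
        return self.initial_p + fraction * (self.final_p - self.initial_p)


class ExponentialSchedule:
    """Decays epsilon exponentially from initial_p towards min_p.
    The exact decay law here (initial_p * exp(-t / decay)) is an
    assumed formula for illustration."""

    def __init__(self, initial_p=1.0, min_p=0.01, decay=500000):
        self.initial_p = initial_p
        self.min_p = min_p
        self.decay = decay

    def value(self, t):
        # Never drop below the floor min_p.
        return max(self.min_p, self.initial_p * math.exp(-float(t) / self.decay))
```

With this interface, the policy learner only needs to call `eps_calculator.value(step)` each time it selects an action, so swapping exploration strategies requires no change to the agent itself.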
An example of how to instantiate an agent:
```python
memory_delay = 5000
init_eps = 1.0
memory_eps = 0.8
min_eps = 0.01
eps_decay = 500000

linear = LinearSchedule(schedule_timesteps=memory_delay, initial_p=init_eps, final_p=memory_eps)
exponential = ExponentialSchedule(initial_p=memory_eps, min_p=min_eps, decay=eps_decay)
policy = PrioritizedDoubleDeepQNetwork(4, env[0].action_space.n,
                                       eps_calculator=linear,
                                       memory_eps_calculator=exponential,
                                       memory_delay=memory_delay)
```