tambetm / simple_dqn

Simple deep Q-learning agent.
MIT License
692 stars, 184 forks

Should _restartRandom (src/agent.py) choose a random action? #11

Closed: mw66 closed this issue 8 years ago

mw66 commented 8 years ago

30: reward = self.env.act(0)

Right now every action taken during the random restart is fixed to the first action (index 0).

How about random.randint(0, self.num_actions)?

mw66 commented 8 years ago

Oh, randint includes both endpoints, so it should be:

random.randint(0, self.num_actions-1)
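
A minimal sketch of why the upper bound matters, assuming `num_actions` counts the valid action indices (the value 4 is made up for illustration). `random.randint(a, b)` can return `b` itself, while `random.randrange(n)` stops at `n - 1`:

```python
import random

num_actions = 4  # hypothetical value, for illustration only


def sample_with_off_by_one(rng):
    # randint(a, b) is inclusive at both ends, so this can return
    # num_actions, which is not a valid action index.
    return rng.randint(0, num_actions)


def sample_action(rng):
    # randrange(n) returns 0 .. n-1, the same range as randint(0, n - 1).
    return rng.randrange(num_actions)


rng = random.Random(0)
valid_samples = [sample_action(rng) for _ in range(1000)]
buggy_samples = [sample_with_off_by_one(rng) for _ in range(1000)]
```

Every value in `valid_samples` is a legal index, whereas `buggy_samples` will sooner or later contain `num_actions`.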

tambetm commented 8 years ago

DeepMind uses the null action in their code and evaluations, so I would stick with this. Another option is "human starts", which was introduced in the Gorila paper. You are welcome to submit a pull request for this.
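
The two restart variants discussed in this thread can be sketched side by side. This is not the repository's code: `StubEnv`, `restart_random`, `max_starts` and `null_action_only` are hypothetical names, and action 0 is assumed to be the null (no-op) action, as in the `self.env.act(0)` line quoted above. "Human starts" are not covered here.

```python
import random


class StubEnv:
    """Hypothetical stand-in for the emulator wrapper; it only records actions."""

    def __init__(self):
        self.history = []

    def restart(self):
        self.history = []

    def act(self, action):
        self.history.append(action)
        return 0  # dummy reward


def restart_random(env, num_actions, max_starts, rng, null_action_only=True):
    """Restart the environment, then take a random number of initial steps.

    With null_action_only=True every step uses action 0 (assumed no-op).
    Otherwise each step uses a uniformly random valid action.
    """
    env.restart()
    # randint is inclusive at both ends; assumes max_starts >= 1.
    steps = rng.randint(1, max_starts)
    for _ in range(steps):
        action = 0 if null_action_only else rng.randrange(num_actions)
        env.act(action)
    return steps


rng = random.Random(0)
env = StubEnv()
null_steps = restart_random(env, num_actions=4, max_starts=30, rng=rng)
null_history = list(env.history)
random_steps = restart_random(env, 4, 30, rng, null_action_only=False)
random_history = list(env.history)
```

Keeping the null action as the default matches the evaluation setup described above, and the flag leaves room to experiment with random actions without changing that default.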