philtabor / Deep-Q-Learning-Paper-To-Code


Viewing results of gameplay? #7

Closed EXJUSTICE closed 3 years ago

EXJUSTICE commented 4 years ago

Hi Phil,

Thank you for your course. I found it to be the most informative and clear approach to PyTorch OpenAI RL. I would like to view the results of the trained agent in action.

I've previously been using an array to store individual observations and then using sk-video (see below) to make an .mp4 file out of them, or using the Monitor class.

import numpy as np
import skvideo.io

img_array = []

for i in range(2):
    done = False
    observation = env.reset()
    score = 0
    while not done:
        action = agent.choose_action(observation)
        observation_, reward, done, info = env.step(action)
        score += reward
        # store the raw observation for the video
        img_array.append(observation)
        observation = observation_

# stack the collected frames and write them out as an .mp4
skvideo.io.vwrite('gameplay.mp4', np.array(img_array))

However, as you've essentially wrapped the environment itself, these approaches are no longer possible, since the agent expects the fully stacked, resized inputs of shape (1,84,84,4). I think it would be very helpful to have a simple way of viewing the performance of the trained agent in action as a supplement to the course. Is this possible?

EXJUSTICE commented 4 years ago

Managed to solve this by introducing an array into the preprocessFrame wrapper itself for evaluation purposes! But an official approach would be welcome!
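
A minimal sketch of that kind of workaround, assuming a gym ObservationWrapper-style preprocessing class (the raw_frames buffer and the grayscale/resize details here are illustrative, not the course's exact code):

import cv2
import gym
import numpy as np


class PreprocessFrame(gym.ObservationWrapper):
    """Grayscale/resize wrapper that also keeps the raw RGB frames
    around so they can be written to video after an evaluation run."""

    def __init__(self, env, shape=(84, 84)):
        super().__init__(env)
        self.shape = shape
        self.observation_space = gym.spaces.Box(
            low=0.0, high=1.0, shape=self.shape, dtype=np.float32)
        self.raw_frames = []  # evaluation-only buffer of original frames

    def observation(self, obs):
        self.raw_frames.append(obs)  # stash the untouched frame
        gray = cv2.cvtColor(obs, cv2.COLOR_RGB2GRAY)
        resized = cv2.resize(gray, self.shape, interpolation=cv2.INTER_AREA)
        return (resized / 255.0).astype(np.float32)

After an evaluation episode, env.raw_frames can be passed to skvideo.io.vwrite the same way as above.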

philtabor commented 4 years ago

Check out lecture #39. I show how to save video of the agent playing the game, and it renders with the original frames (not our grayscale / downsized images).
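
For anyone landing here without the lecture, older gym versions offer the Monitor wrapper, which records the original render frames rather than the preprocessed ones. A sketch under that assumption (not necessarily the exact code from lecture #39; agent and the preprocessing pipeline are assumed to exist):

import gym
from gym import wrappers

# wrap the *base* environment with Monitor before any preprocessing
# wrappers, so the recorded video shows the original frames
env = gym.make('PongNoFrameskip-v4')
env = wrappers.Monitor(env, 'tmp/video',
                       video_callable=lambda ep: True, force=True)
# ... apply the course's stacking / resizing wrappers on top here ...

observation = env.reset()
done = False
while not done:
    action = agent.choose_action(observation)
    observation, reward, done, info = env.step(action)
env.close()  # flushes the recorded .mp4 files to tmp/video/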

EXJUSTICE commented 4 years ago

Thank you Phil,

You'll find my adaptation of your code for Doom here: https://github.com/EXJUSTICE/Doom_DQN_GC/tree/master/PytorchDQN

Will you be doing a course on policy gradient methods? Or is the Actor-Critic course sufficient to cover the area, in your opinion?

Also, I notice that in DDQN you specify that we must evaluate S(t+1) through both the eval and target networks; in that case the result for S(t+1) is a set of state-action values, correct? You do end up calling T.argmax() on it later.
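
For context, my understanding of the double-DQN target is roughly the following (a sketch assuming PyTorch networks q_eval and q_next whose forward() returns Q-values for every action; variable names are illustrative):

import torch as T

# q_eval and q_next are the online (eval) and target networks;
# states_ is a batch of S(t+1) observations.
with T.no_grad():
    q_eval_next = q_eval.forward(states_)    # Q_eval(S(t+1), .)
    q_target_next = q_next.forward(states_)  # Q_target(S(t+1), .)

    # action selection uses the eval network ...
    max_actions = T.argmax(q_eval_next, dim=1)

    # ... but the value of that action comes from the target network
    indices = T.arange(states_.size(0))
    target_q = q_target_next[indices, max_actions]
    target_q[dones] = 0.0                    # terminal states bootstrap to 0

    y = rewards + gamma * target_q           # TD target for the update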