shakenes / vizdoomgym

OpenAI Gym wrapper for ViZDoom environments
MIT License

Monitor class not working - how to visualize? #4

Closed EXJUSTICE closed 1 year ago

EXJUSTICE commented 4 years ago

Hi

My model has trained on the vizdoomgym environment, but how do you recommend visualizing gameplay results? I've tried wrapping the environment in the Monitor class, but no video file is found afterwards, so show_video() prints "Could not find video". The same code works fine with OpenAI's default Gym environments.

Any help is appreciated!

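import base64
import glob

from gym.wrappers import Monitor
from IPython import display as ipythondisplay
from IPython.display import HTML

def show_video():
  # Embed the first .mp4 found in ./video inline in the notebook.
  mp4list = glob.glob('video/*.mp4')
  if len(mp4list) > 0:
    mp4 = mp4list[0]
    with open(mp4, 'rb') as f:
      video = f.read()
    encoded = base64.b64encode(video)
    ipythondisplay.display(HTML(data='''<video alt="test" autoplay
                loop controls style="height: 400px;">
                <source src="data:video/mp4;base64,{0}" type="video/mp4" />
             </video>'''.format(encoded.decode('ascii'))))
  else:
    print("Could not find video")

def wrap_env(env):
  # Record episodes to ./video, overwriting any earlier recordings.
  env = Monitor(env, './video', force=True)
  return env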
environment = wrap_env(env)
done = False
observation = environment.reset()
new_observation = observation

prev_input = None

with tf.compat.v1.Session() as sess:
    init.run()
    observation, stacked_frames = stack_frames(stacked_frames, observation, True)

    while True:

        # feed the stacked game frames and get the Q values for each action
        actions = mainQ_outputs.eval(feed_dict={X:[observation], in_training_mode:False})

        # get the greedy action
        action = np.argmax(actions, axis=-1)
        actions_counter[str(action)] += 1

        # select the action using epsilon greedy policy
        action = epsilon_greedy(action, global_step)
        environment.render()

        # perform the action and move to the next state, receive reward
        new_observation, reward, done, _ = environment.step(action)

        if done:
            break

        # stack the newest frame so the next Q-value query sees the current state
        observation, stacked_frames = stack_frames(stacked_frames, new_observation, False)

    environment.close()
    show_video()

EXJUSTICE commented 4 years ago

I made a workaround using skvideo, but I still cannot get Monitor to work.
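
For anyone landing here, this is a minimal sketch of that kind of workaround, not the exact code I used. It assumes the environment follows the classic Gym API (reset() returns an observation, step() returns a 4-tuple) and that each observation is an H x W x 3 uint8 frame. The names collect_frames, save_video and policy are hypothetical helpers made up for illustration. Writing the file relies on skvideo.io.vwrite from scikit-video, which needs ffmpeg installed.

```python
def collect_frames(env, policy, max_steps=1000):
    """Run one episode and return every observation seen, in order.

    `policy` is any callable mapping an observation to an action
    (hypothetical stand-in for the trained network).
    """
    frames = []
    observation = env.reset()
    frames.append(observation)
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, done, _info = env.step(action)
        frames.append(observation)
        if done:
            break
    return frames


def save_video(frames, path="gameplay.mp4"):
    """Write the collected frames to a video file with scikit-video."""
    # Imported lazily so collect_frames works without these packages.
    import numpy as np
    import skvideo.io

    # vwrite expects an array shaped (num_frames, height, width, channels).
    skvideo.io.vwrite(path, np.asarray(frames, dtype=np.uint8))
```

Usage would be something like save_video(collect_frames(env, my_policy)), called on the unwrapped env rather than the Monitor-wrapped one, after which the resulting file can be displayed the same way show_video() does.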