uber-research / deep-neuroevolution

Deep Neuroevolution

Snapshots visualization #35

Open EmanueleLM opened 4 years ago

EmanueleLM commented 4 years ago

When I try to visualize the snapshots generated for games like Frostbite, I just run the command:

python -m scripts.viz 'FrostbiteNoFrameskip-v0' <snapshot_file>

This works like a charm. But when I change the game, for example to SpaceInvaders, and, after training and generating the snapshot files, run:

python -m scripts.viz 'SpaceInvadersNoFrameskip-v0' <snapshot_file>

It fails with a shape mismatch: viz.py feeds observations of shape (1, 84, 84, 4), but the policy's input placeholder for SpaceInvaders has shape (?, 210, 160, 3).

It's as if viz.py hard-codes the expected input shape and can't handle games whose tensors are not (1, 84, 84, 4).

Did anyone run into the same issue?

Details of the error:

Input command:

python3 -m scripts.viz 'SpaceInvadersNoFrameskip-v0' keep_exploring/old_gens/gen_125-153/snapshot_iter00028_rew1100.h5

Error reported:

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/emanuele/Desktop/deepneuro/deep-neuroevolution/scripts/viz.py", line 63, in <module>
    main()
  File "/home/emanuele/.local/lib/python3.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/emanuele/.local/lib/python3.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/emanuele/.local/lib/python3.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/emanuele/.local/lib/python3.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/emanuele/Desktop/deepneuro/deep-neuroevolution/scripts/viz.py", line 54, in main
    rews, t, novelty_vector = pi.rollout(env, render=True, random_stream=np.random if stochastic else None)
  File "/home/emanuele/Desktop/deepneuro/deep-neuroevolution/es_distributed/policies.py", line 509, in rollout
    ac = self.act(ob[None], random_stream=random_stream)[0]
  File "/home/emanuele/Desktop/deepneuro/deep-neuroevolution/es_distributed/policies.py", line 485, in act
    return self._act(train_vars)
  File "/home/emanuele/Desktop/deepneuro/deep-neuroevolution/es_distributed/tf_util.py", line 176, in <lambda>
    return lambda *inputs : f(*inputs)[0]
  File "/home/emanuele/Desktop/deepneuro/deep-neuroevolution/es_distributed/tf_util.py", line 191, in __call__
    results = get_session().run(self.outputs_update, feed_dict=feed_dict)[:-1]
  File "/home/emanuele/.local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/home/emanuele/.local/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1149, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 84, 84, 4) for Tensor 'GAAtariPolicy/Placeholder:0', which has shape '(?, 210, 160, 3)'
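
The last line of the traceback is a plain shape comparison: every fixed dimension of the placeholder must equal the corresponding dimension of the fed value, and `?` matches any batch size. A minimal sketch of that check, using a hypothetical helper `shapes_compatible` (not part of the repository or of TensorFlow) and the two shapes from the error:

```python
def shapes_compatible(value_shape, placeholder_shape):
    """Return True if value_shape could be fed to placeholder_shape.

    None in placeholder_shape plays the role of '?' (any size).
    Hypothetical helper for illustration only.
    """
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(p is None or p == v
               for v, p in zip(value_shape, placeholder_shape))


# Shapes copied from the ValueError above.
fed = (1, 84, 84, 4)
expected = (None, 210, 160, 3)

print(shapes_compatible(fed, expected))               # False
print(shapes_compatible((1, 210, 160, 3), expected))  # True
```

Running a check like this on `ob[None].shape` before the rollout would surface the mismatch without going through the TensorFlow session.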

EmanueleLM commented 4 years ago

I've almost solved the problem, but some issues remain.

If you modify the class 'WarpFrame' in atari_wrappers.py to this version:

class WarpFrame(gym.ObservationWrapper):
    def __init__(self, env, show_warped=False):
        """Warp frames to 84x84 as done in the Nature paper and later work."""
        gym.ObservationWrapper.__init__(self, env)

        if 'SpaceInvaders' in str(env):
            self.res1, self.res2 = 210, 160
        else:
            self.res1, self.res2 = 84, 84
        self.observation_space = spaces.Box(low=0, high=255, shape=(self.res1, self.res2, 1))
        self.viewer = None
        self.show_warped = show_warped

And add this to the first part of the function 'wrap_deepmind':

def wrap_deepmind(env, episode_life=False, skip=4, stack_frames=4, noop_max=30, noops=None, show_warped=False):
    """Configure environment for DeepMind-style Atari.
    Note: this does not include frame stacking!""" 
    if 'SpaceInvaders' in str(env):
        stack_frames = 3
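
The shape arithmetic behind these two changes can be sketched as follows: with warping left at 210x160 and three single-channel frames stacked along the last axis, the observation comes out as (210, 160, 3), which is the shape the placeholder asks for. This is only a sketch with a hypothetical helper `stack_last_frames`, not the repository's frame-stacking code, and it says nothing about whether the trained network actually saw stacked grayscale frames or raw RGB frames.

```python
from collections import deque

import numpy as np


def stack_last_frames(frames, k):
    """Concatenate the last k single-channel frames along the channel axis.

    Hypothetical helper for illustration; assumes each frame has shape (H, W, 1).
    """
    buf = deque(maxlen=k)  # keeps only the k most recent frames
    for frame in frames:
        buf.append(frame)
    return np.concatenate(list(buf), axis=2)


# Five dummy grayscale frames at the unwarped Atari resolution.
frames = [np.zeros((210, 160, 1), dtype=np.uint8) for _ in range(5)]
stacked = stack_last_frames(frames, k=3)
print(stacked.shape)  # (210, 160, 3)
```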

Once this is done, viz.py runs, but the agent doesn't play as well as it does on 'standard' Atari games like Frostbite. Note that this happens even though the snapshot indicates a higher score than the one viz.py then shows.

If I modify the first line of the function '_observation(self, obs)', which uses a hard-coded numpy array of magic floats, the behavior of the net and the scores change dramatically, so I'd like to ask what this array is meant to represent. The GitHub version is the following:

    def _observation(self, obs):
        frame = np.dot(obs.astype('float32'), np.array([0.299, 0.587, 0.114], 'float32'))  # real black magic?
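
For reference, the three constants match the standard ITU-R BT.601 luma weights for converting RGB to grayscale (they sum to 1, with green weighted most heavily). A minimal sketch of what that line computes, assuming `obs` is an RGB frame of shape (210, 160, 3); the names `LUMA_WEIGHTS` and `to_grayscale` are made up for illustration:

```python
import numpy as np

# Per-channel weights (R, G, B), same values as in _observation.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114], 'float32')


def to_grayscale(obs):
    """Collapse the RGB axis into one brightness value per pixel."""
    return np.dot(obs.astype('float32'), LUMA_WEIGHTS)


# A pure-white frame and a pure-green frame at Atari resolution.
white = np.full((210, 160, 3), 255, dtype=np.uint8)
green = np.zeros((210, 160, 3), dtype=np.uint8)
green[..., 1] = 255

gray_white = to_grayscale(white)
gray_green = to_grayscale(green)
print(gray_white.shape)  # (210, 160)
print(gray_white[0, 0], gray_green[0, 0])
```

If that reading is right, changing the weights changes the grayscale image the network receives relative to what it was trained on, which would be consistent with the scores shifting.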