Lab 2 - Hard (False major bug and minor bug)

False major bug

As promised during the sync session, here is the bug I found in the Hard version of Lab 2. When the first Experience Replay ran, execution failed with an illegal input to the numpy.stack operation, caused by arrays with inconsistent dimensions.

I initially thought the issue was in the original code, specifically in this segment:

        new_observations, reward, done = env.step(action)

        new_observation = observation[:, :, 1:]
        new_observation = np.concatenate([new_observation, new_observations], axis=2)
        total_steps += 1

I put it down to incorrect concatenation, i.e. new_observations being used where new_observation should have been. Having looked at the original notebook again, though, it seems the error came from my own modification of the notebook, so I apologise for raising concerns.
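For anyone who hits the same numpy.stack error, here is a minimal sketch of how the frame window above is meant to keep its shape, and how a shape mismatch surfaces later at replay time. The 84x84x4 observation shape and the single-frame stand-in for the env.step output are assumptions for illustration, not values taken from the notebook.

```python
import numpy as np

# Assumed shapes for illustration only: a stack of 4 frames of 84x84 pixels.
observation = np.zeros((84, 84, 4))
# Stand-in for the single new frame returned by env.step(action).
new_observations = np.ones((84, 84, 1))

# Drop the oldest frame and append the newest one along the channel axis.
new_observation = observation[:, :, 1:]
new_observation = np.concatenate([new_observation, new_observations], axis=2)
assert new_observation.shape == observation.shape

# Entries of equal shape can be stacked into a replay batch.
batch_shape = np.stack([observation, new_observation]).shape

# A wrongly built entry (here: 6 channels instead of 4) only fails
# once it is stacked with the others, i.e. at the first Experience Replay.
bad_observation = np.zeros((84, 84, 6))
try:
    np.stack([observation, bad_observation])
    stack_failed = False
except ValueError:
    stack_failed = True
```

Checking the shape right after the concatenation, as the assert does here, catches the problem at the step where it is introduced rather than at replay.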

Minor bug

However, a minor bug that may be worth mentioning involves the saving of the model.

    # Periodically save the model and print statistics
    if episode % 1000 == 0 and episode != 0:
        saver.save(sess, model_path+'/model-'+str(i)+'.cptk')
        print("Saved Model")
    if episode % 10 == 0 and episode != 0:
        print ("Mean Reward: {}".format(np.mean(episode_rewards[-10:])))

In this code segment I recommend replacing the hard-coded 1000 in if episode % 1000 == 0 with a value derived from num_episodes, since you still want to save your outputs when the number of episodes changes. More importantly, the main for loop runs over range(num_episodes), i.e. from 0 to num_episodes-1 (e.g. 0 to 999). With num_episodes = 1000, episode % 1000 == 0 is therefore only true at episode 0, which the episode != 0 check excludes, so the model is never saved. The saving operation should instead be performed with:

    if episode % (num_episodes - 1) == 0 and episode != 0:
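To make the difference concrete, the sketch below lists the episodes at which each condition would trigger a save. The helper save_episodes and the three condition functions are hypothetical names written for this illustration; they are not part of the notebook.

```python
def save_episodes(num_episodes, condition):
    """Return the episodes in range(num_episodes) where a save would fire."""
    return [ep for ep in range(num_episodes) if condition(ep, num_episodes)]

def original(ep, n):
    # Condition as it appears in the notebook.
    return ep % 1000 == 0 and ep != 0

def proposed(ep, n):
    # Condition suggested in this issue.
    return ep % (n - 1) == 0 and ep != 0

def last_episode(ep, n):
    # Plain comparison that fires on the final episode only.
    return ep == n - 1

saves_original_1000 = save_episodes(1000, original)
saves_proposed_1000 = save_episodes(1000, proposed)
saves_last_1000 = save_episodes(1000, last_episode)
saves_original_2500 = save_episodes(2500, original)
saves_proposed_2500 = save_episodes(2500, proposed)
```

With num_episodes = 1000 the original condition never fires, while the proposed one fires once, at episode 999. Two caveats on the proposed line: it saves only at the final episode, so longer runs lose the intermediate checkpoints that the original gives (e.g. at 1000 and 2000 for 2500 episodes), and it divides by zero when num_episodes is 1. A plain episode == num_episodes - 1 comparison fires on the same final episode for runs of two or more episodes and avoids the division.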
