False major bug
As promised during the sync session, here is the bug I found in the Hard version of Lab 2.
When the first Experience Replay happened, execution failed because numpy.stack received an illegal input: the arrays being stacked had inconsistent dimensions.
I thought the issue was in the original code and was caused by incorrect concatenation, i.e. new_observations being used instead of new_observation. Looking at the original notebook again, however, it seems this could have been due to my own modification of the notebook, so I apologise for raising concerns.
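For anyone who hits the same error, here is a minimal sketch of how mixing the two variables can break numpy.stack. The shapes are hypothetical (a single observation as a vector of 4 values, a batch with an extra leading dimension) and are not taken from the notebook.

```python
import numpy as np

# Hypothetical shapes, for illustration only.
new_observation = np.zeros(4)        # a single observation
new_observations = np.zeros((1, 4))  # a batch holding one observation

# Stacking arrays that all share one shape works.
consistent = [np.zeros(4), new_observation]
stacked_shape = np.stack(consistent).shape
print(stacked_shape)

# Stacking arrays with different shapes raises ValueError.
mixed = [np.zeros(4), new_observations]
try:
    np.stack(mixed)
    stack_failed = False
except ValueError as err:
    stack_failed = True
    print("np.stack failed:", err)
```

If the replay buffer stores even one entry with the wrong shape, the error only appears later, when the batch is stacked, which is why it can look like a bug in the replay code.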
Minor bug
However, a minor bug that may be worth mentioning involves the saving of the model.
# Periodically save the model and print statistics
if episode % 1000 == 0 and episode != 0:
    saver.save(sess, model_path+'/model-'+str(i)+'.cptk')
    print("Saved Model")
if episode % 10 == 0 and episode != 0:
    print("Mean Reward: {}".format(np.mean(episode_rewards[-10:])))
In this code segment I recommend replacing the hard-coded 1000 in if episode % 1000 == 0 with a value based on num_episodes, so that the model is still saved when the number of episodes changes. More importantly, the main for loop runs over range(num_episodes), i.e. from 0 to num_episodes-1. With num_episodes = 1000 that is 0 to 999, so episode % 1000 == 0 is only true at episode 0, which the episode != 0 check excludes, and the model is never saved. The saving operation should instead be performed at the final episode:
if episode % (num_episodes - 1) == 0 and episode != 0:
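To make the difference concrete, here is a small sketch that lists the episodes at which each condition would trigger a save. The helper names (saving_episodes, original_check, proposed_check, final_episode_check) are my own and not part of the notebook, and no model is actually saved.

```python
def saving_episodes(num_episodes, should_save):
    """Return the episodes in range(num_episodes) at which should_save is true."""
    return [ep for ep in range(num_episodes) if should_save(ep, num_episodes)]

def original_check(episode, num_episodes):
    # Condition from the notebook: fixed interval of 1000, skipping episode 0.
    return episode % 1000 == 0 and episode != 0

def proposed_check(episode, num_episodes):
    # Condition proposed above. Assumes num_episodes > 1, since
    # num_episodes == 1 would mean a modulo by zero.
    return episode % (num_episodes - 1) == 0 and episode != 0

def final_episode_check(episode, num_episodes):
    # Hypothetical alternative: compare directly against the last episode.
    return episode == num_episodes - 1

print(saving_episodes(1000, original_check))       # []
print(saving_episodes(1000, proposed_check))       # [999]
print(saving_episodes(2500, original_check))       # [1000, 2000]
print(saving_episodes(2500, proposed_check))       # [2499]
print(saving_episodes(2500, final_episode_check))  # [2499]
```

Note that the proposed condition saves only once, at the last episode, so the periodic checkpoints are lost for long runs. If both are wanted, the two conditions could be combined with or.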