Note: (JJ) read through the line-by-line diffs to make sure there was no functional change.
Also ran a small test with the replay buffer and episode length held fixed, and compared the rewardLogs and lossLogs against those from before the change; they were quite similar, which suggests nothing broke.
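For anyone repeating this check, a log comparison like the one described above could be sketched as follows. This is only an illustration, not the script that was actually used: it assumes the rewardLogs and lossLogs can be loaded as plain lists of floats, and the helper name `logs_match`, the tolerances, and the sample values are all hypothetical.

```python
import math


def logs_match(baseline, candidate, rel_tol=0.05, abs_tol=1e-8):
    """Return True if both logs have the same length and every pair of
    entries is close within the given tolerances (hypothetical helper)."""
    if len(baseline) != len(candidate):
        return False
    return all(
        math.isclose(b, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for b, c in zip(baseline, candidate)
    )


# Made-up values for illustration only; these are not results from the real runs.
reward_before = [1.0, 2.0, 3.5]
reward_after = [1.0, 2.02, 3.49]
print(logs_match(reward_before, reward_after))
```

A tolerance-based comparison is used here because the note reports the logs as "quite similar" rather than identical; the right tolerance depends on how much run-to-run noise remains once the replay buffer and episode length are fixed.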