ejunprung closed this issue 2 years ago
Results from running manually one-by-one so far.
is this with all models?
Yeah. We think it's using the same seed for each episode. I think we need to reset the seed after each MC iteration.
@ejunprung I found the issue. When looping through episodes, `done` wasn't being reset to False, and `self.reset()` needs to be called at the start of each episode too.
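For anyone hitting the same thing, here is a minimal sketch of the loop shape that avoids it. `ToyEnv` and `run_monte_carlo` are hypothetical stand-ins, not the project's actual classes; the only point is that both the environment and the `done` flag are reset at the top of every episode.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment, not the project's simulator."""

    def __init__(self, horizon=5, seed=None):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Clear all per-episode state.
        self.t = 0
        self.total_reward = 0.0

    def step(self):
        # Advance one step and accumulate a random reward.
        self.t += 1
        reward = self.rng.random()
        self.total_reward += reward
        done = self.t >= self.horizon
        return reward, done


def run_monte_carlo(env, n_episodes):
    final_rewards = []
    for _ in range(n_episodes):
        env.reset()   # start every episode from a fresh state
        done = False  # re-arm the flag; if it stays True, later episodes never step
        while not done:
            _, done = env.step()
        final_rewards.append(env.total_reward)
    return final_rewards
```

If `done` is left as True after the first episode, the inner `while` never runs again, so every later episode just reports the same stale final reward.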
Now it is working fine, but it takes quite a long time to run. I think it is slow because it writes to the CSV file at the end of each episode. If we write the results once at the end of the MC run instead, it might be faster.
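Something along these lines is what I have in mind, as a rough sketch only. The function name, column names, and placeholder rewards are all made up for illustration, and I haven't profiled it, so the per-episode write being the bottleneck is still a guess.

```python
import csv


def write_results_once(rows, path):
    """Write all buffered (episode, final_reward) rows in a single pass."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["episode", "final_reward"])  # hypothetical column names
        writer.writerows(rows)


# Buffer results in memory during the MC run instead of writing per episode.
rows = []
for episode in range(3):
    final_reward = episode * 1.5  # placeholder for the real episode result
    rows.append((episode, final_reward))
```

Calling `write_results_once(rows, "results.csv")` after the loop opens the file once instead of once per episode.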
If you run MC, the final reward is the same for each episode.
But if I run the simulation manually one by one, the final rewards are different each time. I'm not sure why, is it related to seeds?