MC output seems strange

PathmindAI / pathmind-api


MC output seems strange #36

Closed ejunprung closed 2 years ago

ejunprung commented 2 years ago

If you run a Monte Carlo (MC) simulation, the final reward is the same for every episode.

But if I run the simulation manually, one episode at a time, the final rewards are different each time. I'm not sure why. Is it related to seeds?

ejunprung commented 2 years ago

Results from running manually one-by-one so far.

slinlee commented 2 years ago

Is this with all models?

ejunprung commented 2 years ago

Yeah. We think it's using the same seed for each episode. I think we need to reset the seed after each MC iteration.

SaharEs commented 2 years ago

@ejunprung I found the issue. When looping through episodes, done wasn't being reset to False, and self.reset() also needs to be called at the start of each episode.
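
A minimal sketch of the kind of bug described above, for anyone hitting the same symptom. ToySimulation, run_buggy, and run_fixed are hypothetical names made up for illustration; this is not the pathmind-api code, only an assumed loop shape that reproduces identical rewards across episodes.

```python
import random


class ToySimulation:
    """Hypothetical stand-in for a simulation; not the pathmind-api class."""

    def __init__(self, max_steps=5, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Clear per-episode state so the next episode starts fresh.
        self.steps = 0
        self.total_reward = 0.0

    def step(self):
        self.steps += 1
        self.total_reward += self.rng.random()

    def is_done(self):
        return self.steps >= self.max_steps


def run_buggy(sim, episodes):
    """done is never set back to False, so only the first episode steps."""
    rewards = []
    done = False
    for _ in range(episodes):
        while not done:
            sim.step()
            done = sim.is_done()
        # Later episodes skip the while loop and re-report the same total.
        rewards.append(sim.total_reward)
    return rewards


def run_fixed(sim, episodes):
    """Reset the simulation and the done flag at the start of each episode."""
    rewards = []
    for _ in range(episodes):
        sim.reset()
        done = False
        while not done:
            sim.step()
            done = sim.is_done()
        rewards.append(sim.total_reward)
    return rewards


buggy = run_buggy(ToySimulation(), 3)
fixed = run_fixed(ToySimulation(), 3)
print("buggy:", buggy)
print("fixed:", fixed)
```

In the buggy loop every episode reports the same final reward because the inner loop only runs once. In the fixed loop each episode steps again, so the rewards differ.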

Now it is working fine, but it takes quite a long time to run. I think it is slow because it writes to the CSV file at the end of each episode. It might be faster to write the results once, at the end of the MC run.
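
A rough sketch of the "write once at the end" idea, using only the standard library csv module. run_monte_carlo, fake_episode, and the episode/reward column names are assumptions for illustration, not the pathmind-api interface. Whether the per-episode CSV write is really the bottleneck is still a guess and would need profiling.

```python
import csv
import os
import tempfile


def run_monte_carlo(run_episode, episodes, out_path):
    """Run all episodes, buffer the results in memory, and write the CSV once."""
    rows = []
    for episode in range(episodes):
        rows.append({"episode": episode, "reward": run_episode(episode)})

    # Single file open and write after the loop, instead of one per episode.
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["episode", "reward"])
        writer.writeheader()
        writer.writerows(rows)
    return rows


def fake_episode(episode):
    # Hypothetical placeholder for running one simulation episode.
    return episode * 1.5


out_path = os.path.join(tempfile.mkdtemp(), "mc_results.csv")
rows = run_monte_carlo(fake_episode, 4, out_path)
print(f"wrote {len(rows)} rows to {out_path}")
```

One trade-off: if the run crashes partway through, nothing has been written yet, so for very long runs it may be worth flushing every N episodes instead.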