openai / coinrun

Code for the paper "Quantifying Transfer in Reinforcement Learning"
https://blog.openai.com/quantifying-generalization-in-reinforcement-learning/
MIT License

fix accessing potential 0 elements array when use_aux is False #27

Closed Kelvinson closed 5 years ago

Kelvinson commented 5 years ago

When the game is over, computing game_over_rew reads long_aux_rewards, which can have size zero along axis 1 when use_aux is False. We can first check whether use_aux is True and only then access long_aux_rewards; otherwise we simply set game_over_rew to zero.
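A minimal sketch of the guard described above, not the actual patch: `game_over_reward` is a hypothetical helper, and the array shape `(nenvs, 0)` with `nenvs = 4` is assumed here to stand in for `long_aux_rewards` when there are no auxiliary rewards.

```python
import numpy as np


def game_over_reward(long_aux_rewards, i, use_aux):
    """Hypothetical helper: game-over reward for environment i.

    Only index into long_aux_rewards when auxiliary rewards exist;
    otherwise fall back to zero, as the description proposes.
    """
    if use_aux:
        return float(long_aux_rewards[i, 0])
    return 0.0


nenvs = 4  # assumed number of parallel environments

# With no auxiliary rewards the array has zero columns (assumed shape).
empty = np.zeros((nenvs, 0), dtype=np.float32)

# Unguarded access to column 0 of a zero-column array raises IndexError.
try:
    empty[0, 0]
    unguarded_raises = False
except IndexError:
    unguarded_raises = True

# The guarded version returns zero instead of raising.
guarded_value = game_over_reward(empty, 0, use_aux=False)
```

The guard keys off `use_aux` rather than catching the exception, so the no-auxiliary-reward case is handled explicitly.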

christopherhesse commented 5 years ago

This seems likely to be correct, but when is this used? Is ale.lives ever set?

Kelvinson commented 5 years ago

I don't know about coinenv, but for a normal gym environment, where there is no aux_rew, self.long_aux_rewards[i,0] is accessed at the end of the game even though self.long_aux_rewards has size zero along axis 1. It is created at https://github.com/Kelvinson/coinrun/blob/8ad3339286e486f4258761a7311814b7e8d00665/coinrun/wrappers.py#L54

Kelvinson commented 5 years ago

Never mind, maybe it's not a bug for coinenv.