A quick question. While using the code for some preliminary RL experiments, I found this in main.py:
```python
# FIXME: works only for environments with sparse rewards
for idx, eps_done in enumerate(done):
    if eps_done:
        episode_rewards.append(reward[idx])
```
Is this only used to collect episode rewards for displaying statistics, with nothing to do with the training itself?

Does the FIXME mean that the intent is to collect the accumulated reward of the whole episode that just ended, but the current code assumes the reward at the last step equals the accumulated reward?
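If I'm reading it right, here is a minimal sketch of what I would expect the corrected bookkeeping to look like: accumulate the per-step reward into a per-env running sum and append that sum when an episode ends. Names like `num_envs` and `running_returns` are my own placeholders, not code from this repo:

```python
import numpy as np

num_envs = 4                          # hypothetical vec-env size, not from main.py
running_returns = np.zeros(num_envs)  # per-env accumulated return (my own name)
episode_rewards = []

# Placeholder step outputs so the snippet runs standalone; in main.py these
# would come from the vectorized environment at each step.
reward = np.array([0.1, 1.0, -0.2, 0.0])
done = np.array([False, True, False, False])

# Inside the rollout loop, after each env step:
running_returns += reward                  # accumulate dense per-step rewards
for idx, eps_done in enumerate(done):
    if eps_done:
        episode_rewards.append(running_returns[idx])  # log the full episode return
        running_returns[idx] = 0.0                    # reset for the next episode
```

For a sparse-reward environment (all reward at the final step), this reduces to the current behavior, which would explain the FIXME.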
Thanks!