Using run.py as a base, I created a manual agent (action = 18, keep running forward) that sometimes passes a floor or two. However, the episode reward always shows 0.0.
Changing line 11
obs, reward, done, info = env.step(action)
to
obs, rew, done, info = env.step(action)
reward += rew
fixes the issue and shows the rewards correctly.
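
For anyone who wants to see the difference without launching the environment, here is a minimal sketch. It assumes run.py sets `reward = 0.0` before the loop and returns it at the end, which is my reading of the script, not something I have confirmed. `StubEnv` and both function names are hypothetical stand-ins; only the `env.step(action)` return tuple matches the lines quoted above.

```python
class StubEnv:
    """Hypothetical stand-in for the real environment.

    Replays a fixed list of per-step rewards and ends the episode
    after the last one.
    """

    def __init__(self, rewards):
        self._rewards = list(rewards)
        self._t = 0

    def reset(self):
        self._t = 0
        return None  # observation is not needed for this sketch

    def step(self, action):
        rew = self._rewards[self._t]
        self._t += 1
        done = self._t >= len(self._rewards)
        return None, rew, done, {}


def run_episode_overwrite(env, action=18):
    """Original pattern: each step overwrites `reward`."""
    env.reset()
    done = False
    reward = 0.0  # assumed initialization, see note above
    while not done:
        # `reward` ends up holding only the last step's reward
        obs, reward, done, info = env.step(action)
    return reward


def run_episode_accumulate(env, action=18):
    """Fixed pattern: per-step reward is added to a running total."""
    env.reset()
    done = False
    reward = 0.0
    while not done:
        obs, rew, done, info = env.step(action)
        reward += rew
    return reward


if __name__ == "__main__":
    # Two rewarded steps, with an unrewarded final step.
    step_rewards = [0.0, 1.0, 0.0, 1.0, 0.0]
    print(run_episode_overwrite(StubEnv(step_rewards)))
    print(run_episode_accumulate(StubEnv(step_rewards)))
```

With the overwrite pattern the function returns whatever the final step paid out, so any episode whose last step gives no reward reports 0.0 regardless of what was collected earlier.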