Farama-Foundation / Arcade-Learning-Environment

The Arcade Learning Environment (ALE) -- a platform for AI research.
GNU General Public License v2.0

Rewards #416

Closed Adrii98 closed 3 years ago

Adrii98 commented 3 years ago

Hi, I have a question. I am trying to pass to the ALE the action that gives the biggest reward at a given moment, so I want to know how I can get the reward of a given action without using the function act. My goal is to find the action with the biggest reward and then pass that action to the ALE act method. Thanks in advance.

JesseFarebro commented 3 years ago

Although this is technically possible using cloneState / restoreState, I'm not sure why you'd want to do this. Greedily choosing the action that maximizes the reward on the next time step isn't necessarily optimal. If you really want to do this, try playing around with cloneState / restoreState; you should be able to do something like:

currentState = cloneState()
for every action:
    restoreState(currentState)
    execute action
    observe reward

check which action resulted in the maximum reward
restoreState(currentState) before executing that action for real
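
The pseudocode above can be sketched in Python roughly as follows. This is a minimal sketch, not something taken from the ALE codebase: greedy_action is a hypothetical helper name, and it only assumes that the object passed in has cloneState(), restoreState(state) and act(action) methods, as the ALE interface does. ToyEmulator is a made-up stand-in so the example runs without a ROM.

```python
def greedy_action(ale, actions):
    """One-step lookahead: try every action from the same saved state.

    `ale` is any object with cloneState / restoreState / act.
    Returns (best_action, best_reward); ties go to the earliest action.
    """
    saved = ale.cloneState()
    best_action, best_reward = None, float("-inf")
    for action in actions:
        ale.restoreState(saved)      # rewind to the saved state
        reward = ale.act(action)     # simulate the action, observe reward
        if reward > best_reward:
            best_action, best_reward = action, reward
    ale.restoreState(saved)          # leave the emulator where it started
    return best_action, best_reward


class ToyEmulator:
    """Hypothetical stand-in for the emulator (NOT part of ALE).

    The state is a single integer position; an action is added to it,
    and landing on position 2 gives a reward of 1.
    """

    def __init__(self):
        self.position = 0

    def cloneState(self):
        return self.position

    def restoreState(self, state):
        self.position = state

    def act(self, action):
        self.position += action
        return 1 if self.position == 2 else 0


emu = ToyEmulator()
emu.position = 1
action, reward = greedy_action(emu, [-1, 0, 1])
print(action, reward)  # 1 1
emu.act(action)        # now actually take the chosen action
```

With the real emulator you would presumably pass an ALEInterface instance with a ROM loaded and something like ale.getMinimalActionSet() as the action list. Note that if sticky actions are enabled (a non-zero repeat_action_probability), act is stochastic, so the reward seen during the lookahead may differ from the one you get when you replay the chosen action.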