Farama-Foundation / miniwob-plusplus

MiniWoB++: a web interaction benchmark for reinforcement learning
https://miniwob.farama.org/
MIT License

Reward processors #73

Closed ppasupat closed 1 year ago

ppasupat commented 1 year ago

Description

Rewards from MiniWoB environments include a time penalty and partial rewards, which are not always desirable. A reward processor lets the user specify which type of reward to use. For example, get_binary_reward ignores both the time penalty and partial rewards, so the resulting rewards reflect the pure task success rate.
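As a rough illustration of the idea, a reward processor can be thought of as a function that maps per-episode metadata to a scalar reward. The sketch below is not the code from this PR: the function names `original_reward_sketch` and `binary_reward_sketch`, the metadata keys `raw_reward` and `env_reward`, and the choice of 1.0 / 0.0 as the binary values are all assumptions made for the example.

```python
from typing import Any, Callable, Mapping

# Hypothetical type: a reward processor maps episode metadata to a scalar reward.
RewardProcessor = Callable[[Mapping[str, Any]], float]


def original_reward_sketch(metadata: Mapping[str, Any]) -> float:
    """Return the environment reward unchanged (time penalty and partial credit kept).

    Assumes a hypothetical "env_reward" key holding the penalized reward.
    """
    return float(metadata["env_reward"])


def binary_reward_sketch(metadata: Mapping[str, Any]) -> float:
    """Return 1.0 only for full task success, ignoring time penalty and partial credit.

    Assumes a hypothetical "raw_reward" key holding the reward before the
    time penalty, where 1.0 means the task was fully solved.
    """
    return 1.0 if float(metadata["raw_reward"]) >= 1.0 else 0.0


# Made-up episode metadata, for illustration only.
episodes = [
    {"raw_reward": 1.0, "env_reward": 0.62},   # solved, but slowly
    {"raw_reward": 0.5, "env_reward": 0.31},   # partial credit only
    {"raw_reward": -1.0, "env_reward": -1.0},  # failed
    {"raw_reward": 1.0, "env_reward": 0.90},   # solved quickly
]

# With a 0/1 binary reward, the mean reward is the task success rate.
processor: RewardProcessor = binary_reward_sketch
success_rate = sum(processor(m) for m in episodes) / len(episodes)
print(success_rate)
```

Under these assumptions, averaging the binary reward over episodes gives the fraction of fully solved tasks, whereas averaging the original reward mixes success with speed and partial credit.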
