facebookresearch / ReAgent

A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
https://reagent.ai

CPE in run_gym #206

Closed: sdegeusmob closed this issue 4 years ago

sdegeusmob commented 4 years ago

Is it also possible to do off-policy evaluation (CPE) for train_gym_offline_rl in ml/rl/test/gym/run_gym.py?

czxttkl commented 4 years ago

Adding CPE to run_gym.py would require adapting the replay_buffer to the format CPE expects. It is possible to add CPE to run_gym.py, but since we already support this in the batch training workflows (e.g., dqn_workflow.py), I don't think we'll prioritize this feature request.
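Concretely, CPE estimators (importance sampling, doubly robust, etc.) need the logging policy's propensity for each logged action, which a plain (state, action, reward, next_state) gym replay buffer does not store. Below is a minimal sketch of the kind of per-transition record that would be needed; it is illustrative only, and the class and field names here are assumptions, not ReAgent's actual API.

```python
# Illustrative sketch (not ReAgent's actual replay buffer / CPE API):
# each logged transition needs the logging policy's action probability
# and the set of possible actions before CPE estimators can consume it.
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class CpeTransition:          # hypothetical container, for illustration only
    state: np.ndarray
    action: int
    reward: float
    terminal: bool
    logged_propensity: float  # P_logging(action | state), required by IPS-style estimators
    possible_actions: List[int]

def augment_for_cpe(state, action, reward, terminal, action_probs):
    """Wrap a raw gym transition with the extra fields CPE needs.

    `action_probs` is the logging policy's action distribution at `state`
    (e.g., the epsilon-greedy or softmax probabilities used when acting).
    """
    return CpeTransition(
        state=np.asarray(state, dtype=np.float32),
        action=int(action),
        reward=float(reward),
        terminal=bool(terminal),
        logged_propensity=float(action_probs[action]),
        possible_actions=list(range(len(action_probs))),
    )
```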

sdegeusmob commented 4 years ago

Okay!

CaralHsi commented 4 years ago

Hi, could you please give instructions on how to use CPE in dqn_workflow? I only found the GridWorld example, which uses a discrete state space, but I want to try CPE with a continuous state space. Thanks. :) Also, what is the relationship between CPE and OPE? I am confused by these two similar files.
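To check my understanding of the terminology (the thread above uses "off-policy evaluation (CPE)" interchangeably): is CPE computing something like the one-step inverse-propensity-scoring estimate below, i.e., estimating a target policy's value from data logged by a different behavior policy? This is a generic sketch I wrote for illustration, not ReAgent's implementation; the function name and the numbers are made up.

```python
# Generic one-step (bandit-style) IPS estimator, for illustration only.
import numpy as np

def ips_estimate(rewards, logged_propensities, target_propensities):
    """Return the IPS estimate of the target policy's average reward.

    rewards[i]              reward observed for logged sample i
    logged_propensities[i]  P_behavior(logged action | state) for sample i
    target_propensities[i]  P_target(logged action | state) for sample i
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    weights = np.asarray(target_propensities, dtype=np.float64) / np.asarray(
        logged_propensities, dtype=np.float64
    )
    # Reweight each logged reward by how much more (or less) likely the
    # target policy is to take the logged action than the behavior policy was.
    return float(np.mean(weights * rewards))

# Example: the behavior policy acted uniformly over 2 actions (propensity 0.5);
# the target policy would have chosen the logged actions with these probabilities.
print(ips_estimate(rewards=[1.0, 0.0, 1.0],
                   logged_propensities=[0.5, 0.5, 0.5],
                   target_propensities=[0.9, 0.1, 0.8]))
```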