Closed: sdegeusmob closed this issue 4 years ago
Adding CPE to run_gym.py would require adapting the replay_buffer to the format CPE expects. It is possible to add CPE to run_gym.py, but since we already support this feature in the batch training workflows (e.g., dqn_workflow.py), I don't think we'll prioritize this feature request.
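For readers unfamiliar with what CPE computes: counterfactual (off-policy) policy evaluation estimates a target policy's value from data logged by a different behavior policy, which is why the logged data must carry extra information (e.g., the behavior policy's action propensities) beyond what a plain replay buffer stores. As a rough illustration of the idea, not ReAgent's actual implementation or API, here is a minimal inverse-propensity-scoring (IPS) estimator; all names in it are made up for this sketch:

```python
import numpy as np

def ips_estimate(rewards, logged_probs, target_probs):
    """Inverse-propensity-scoring (IPS) estimate of a target policy's value
    from logged off-policy data: mean of r * pi_target(a|s) / pi_logged(a|s).

    This is why CPE needs the behavior policy's action probabilities stored
    alongside each transition -- a plain replay buffer usually lacks them.
    """
    weights = np.asarray(target_probs) / np.asarray(logged_probs)
    return float(np.mean(weights * np.asarray(rewards)))

# Toy logged data (hypothetical): per-step rewards, the probability the
# logging policy assigned to the action taken, and the probability the
# target policy would assign to that same action.
rewards = [1.0, 0.0, 1.0, 1.0]
logged_probs = [0.5, 0.5, 0.25, 0.5]
target_probs = [1.0, 0.0, 0.5, 1.0]
print(ips_estimate(rewards, logged_probs, target_probs))  # -> 1.5
```

The batch workflows mentioned above (e.g., dqn_workflow.py) log this kind of propensity information as part of their data pipeline, which is what makes CPE straightforward there and not in run_gym.py.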
Okay!
Hi, could you please give instructions on how to use CPE in dqn_workflow? I only found the GridWorld example, which has a discrete state space, but I want to try CPE in a continuous state space. Thanks! :) Also, what is the relationship between CPE and OPE? I'm confused by these two similar files.
Is it also possible to do off-policy evaluation (CPE) for train_gym_offline_rl in ml/rl/test/gym/run_gym.py?