Closed — btx0424 closed this issue 2 months ago
Hi,
Thanks for raising this issue.
I have to double-check how to correctly load the policy, since it has been a while. But can you first check this: for opening a laptop there are two substeps. The first is a motion-planning substep that moves the gripper towards the laptop lid, and the second uses RL to open the laptop. For the trained RL policy to work, the environment must be initialized to the state where the gripper is already attached to the laptop lid, since that is the initial state the RL policy was trained on. When building the environment, are you setting `last_restore_state_file` to the path of the state file that stores the last state of the motion-planning substep?
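To make the two-substep structure concrete, here is a minimal, self-contained sketch of that flow. The function names and state fields are hypothetical stand-ins for illustration, not RoboGen's actual code; only the `last_restore_state_file` idea comes from the comment above.

```python
import json
import tempfile
from pathlib import Path

def run_motion_planning_substep(out_dir: Path) -> Path:
    """Substep 1 (motion planning, hypothetical): move the gripper to the
    laptop lid, then save the final simulator state to a file."""
    final_state = {"gripper_attached": True, "lid_angle_deg": 0.0}
    state_file = out_dir / "motion_planning_last_state.json"
    state_file.write_text(json.dumps(final_state))
    return state_file

def run_rl_substep(last_restore_state_file: Path) -> dict:
    """Substep 2 (RL, hypothetical): restore the state saved by substep 1,
    because the policy was trained with that state as its initial state."""
    state = json.loads(last_restore_state_file.read_text())
    if not state["gripper_attached"]:
        raise RuntimeError("RL policy expects the gripper already on the lid")
    state["lid_angle_deg"] = 90.0  # stand-in for the policy opening the lid
    return state

out_dir = Path(tempfile.mkdtemp())
state_file = run_motion_planning_substep(out_dir)
final_state = run_rl_substep(last_restore_state_file=state_file)
print(final_state["lid_angle_deg"])  # 90.0
```

The point of the sketch: if the RL substep is started from a fresh environment instead of the saved state, it begins outside the initial-state distribution it was trained on, which matches the random-looking behavior described below.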
Yes. I basically modified the reward step of `execute.py` to load from a policy path instead of training a new policy, so the policy execution does start from the last state of the motion-planning step.
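Schematically, the modification described above amounts to a load-or-train branch like the following. All names here are hypothetical stand-ins to show the control flow, not RoboGen's actual functions:

```python
def run_reward_substep(env, policy_path=None, train_policy=None, load_policy=None):
    """Run an RL ('reward') substep: load a saved policy when a path is
    given, otherwise fall back to training one from scratch (the
    original behavior of execute.py). Hypothetical sketch."""
    if policy_path is not None:
        return load_policy(policy_path)  # skip training entirely
    return train_policy(env)

# Toy stand-ins to exercise the branch:
loaded = run_reward_substep(
    env=None,
    policy_path="best_model/checkpoint_1/checkpoint-1",
    load_policy=lambda p: {"source": "loaded", "path": p},
    train_policy=lambda e: {"source": "trained"},
)
print(loaded["source"])  # loaded
```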
Hi,
Sorry for the delayed response. I have added a script for loading a pretrained RL policy: https://github.com/Genesis-Embodied-AI/RoboGen/blob/main/run_policy.py. On my side it correctly loads a pretrained policy and reproduces the behavior produced by `execute.py`.
The checkpoint path should be `.../best_model/checkpoint_**/checkpoint-**`.
Let me know if you cannot reproduce the behavior or run into any other issues.
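For reference, a path of the shape `best_model/checkpoint_**/checkpoint-**` can be resolved programmatically with a small glob helper. This is a sketch I am adding for illustration, not part of `run_policy.py`; the `*.tune_metadata` filter assumes Ray/RLlib-style checkpoint directories, which is an assumption on my part:

```python
from pathlib import Path

def find_checkpoint(best_model_dir: str) -> Path:
    """Return the newest file matching best_model/checkpoint_*/checkpoint-*,
    skipping sidecar files such as checkpoint-12.tune_metadata."""
    candidates = [
        p for p in Path(best_model_dir).glob("checkpoint_*/checkpoint-*")
        if p.suffix == ""  # keep 'checkpoint-12', drop 'checkpoint-12.tune_metadata'
    ]
    if not candidates:
        raise FileNotFoundError(f"no checkpoint found under {best_model_dir}")
    # Compare the numeric suffix so checkpoint-10 ranks above checkpoint-9.
    return max(candidates, key=lambda p: int(p.name.split("-")[-1]))
```

Picking the wrong level of this hierarchy (e.g. `best_model/` itself) may load without an exception but yield an uninitialized policy, which matches the symptom reported in the question below.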
Hi there. Thanks for building this seminal and exciting project.
I ran into some problems when trying to play with a trained policy. For example, after running `execute.py` for the task `Open_laptop`, the resulting directory looks like this. Now that I want to play with the policy, the question is: what should `policy_path` be? I have tried `.../best_model/`, `.../checkpoint_**`, and `.../checkpoint_**/checkpoint-**`, and none of them throws an exception. However, the behavior of the loaded policy looks pretty random and is far from the behavior shown in the `execute.gif` produced during training. Is this expected? Thanks in advance, and looking forward to your response.