Closed dyabel closed 1 year ago
Hi, could you please specify what you mean with evaluating a model in an RL manner? Our standard way of evaluation is resetting the robot to a neutral position and then it has to follow a chain of language instructions. Do you want more information on how to run it or do you need another way of evaluation?
@dyabel, we would like to help you if you give us a bit of information!
Hi, I want to use the already-trained model to interact with the environment and observe the reward. From the code, it looks like the model is only evaluated on offline data.
No, when we run the evaluation, we actually do rollouts in the environment. Check this part of the code.
You can run the evaluation like this:
python hulc/evaluation/evaluate_policy.py --dataset_path <PATH/TO/DATASET> --train_folder <PATH/TO/TRAINING/FOLDER> --checkpoint <PATH/TO/CHECKPOINT>
Add --debug to see a live video of the rollout.
In this line we check whether a subtask was completed; you can use this as a binary reward. We use the word "subtask" here because the agent has to follow a chain of 5 instructions, but every subtask is a complete task such as "open the drawer".
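As a minimal sketch of how that completion check could be turned into a binary reward signal during a rollout: here `env`, `model`, and `task_checker` are hypothetical stand-ins for the actual environment, trained policy, and task oracle used in the evaluation script, and the method names are assumptions rather than the exact API.

```python
def rollout_with_binary_reward(env, model, task_checker, subtask, max_steps=360):
    """Roll out the policy on one language subtask.

    Returns a binary reward: 1 if the subtask was completed within
    max_steps environment steps, 0 otherwise.
    """
    obs = env.reset()
    # Snapshot of the simulator state before acting; the task oracle
    # compares this against the current state to detect completion.
    start_info = env.get_info()

    for _ in range(max_steps):
        action = model.step(obs, subtask)
        obs, _, _, current_info = env.step(action)

        # Ask the oracle which of the given tasks were achieved between
        # start_info and current_info (hypothetical method name).
        completed = task_checker.get_task_info_for_set(
            start_info, current_info, {subtask}
        )
        if subtask in completed:
            return 1  # binary reward: success

    return 0  # binary reward: failure (ran out of steps)
```

The same loop generalizes to the 5-instruction chains used in the standard evaluation by summing the per-subtask rewards and advancing to the next instruction after each success.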
Thank you for the quick reply! I will try that.
Thank you for your great work! I wonder how to evaluate the trained model in an RL manner. Can you provide an example? Thanks.