Just a quick note on `gym_real_world`. At Pollen Robotics, we tried to add a new task id input and:
- We had to make significant changes within the gym environment to make it work, raising the question: is our current gym setup extensible enough for users to add features?
- It was also very hard to debug. For example, we forgot to change the observation space, and instead of getting an error within the step method, we simply did not find the input task id within the observations (see the sketch after this list). This raises the question: is the gym env easily debuggable?
- It's hard to write a small model-testing script. The intertwining of the model and the environment makes it very hard to swap a model between environments, or to use the model in a context where you're not creating an environment at all, e.g. a 100-LoC script with just the model and a for loop.
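To illustrate the debuggability point, here is a hypothetical minimal env (the class name, shapes, and the pattern of building observations from the declared space are assumptions for illustration, not the actual `gym_real_world` code) showing how a forgotten observation-space entry can make a new key vanish silently:

```python
import gymnasium as gym
import numpy as np

# Hypothetical minimal env reproducing the failure mode: the observation is
# built from the declared observation_space, so a key that was never added
# to the space silently disappears instead of raising inside step().
class RealWorldEnv(gym.Env):
    def __init__(self):
        self.observation_space = gym.spaces.Dict({
            # "task_id" was forgotten here...
            "qpos": gym.spaces.Box(-np.inf, np.inf, shape=(7,)),
        })
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(7,))
        self._state = {"qpos": np.zeros(7, dtype=np.float32), "task_id": 0}

    def step(self, action):
        # ...so only the declared keys end up in the returned observation.
        obs = {k: self._state[k] for k in self.observation_space.spaces}
        return obs, 0.0, False, False, {}

env = RealWorldEnv()
obs, *_ = env.step(env.action_space.sample())
assert "task_id" not in obs  # no error anywhere; the key is just missing
```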
I think that the example API should look something like the following:
```python
import time

import cv2

policy = make_policy(hydra_cfg, pretrained_policy_name_or_path)
camera = cv2.VideoCapture(0)
while True:
    start = time.perf_counter()
    observation = {
        "image": camera.read()[1],  # VideoCapture.read() returns (ok, frame)
        "qpos": dynamixel.read([0, 1, 2, 3, 4, 5, 6]),
    }
    observation = preprocess_observation(observation)
    action = policy.select_action(observation)
    # Sleep off the remainder of the control period before actuating.
    inference_time = time.perf_counter() - start
    time.sleep(max(0.0, 1 / fps - inference_time))
    dynamixel.write(action, [0, 1, 2, 3, 4, 5, 6])
```
I think we should let users define and manage their own `gym_env` if they want to, but that should probably not be the default way of using lerobot.
So basically, I'm opening the discussion: can and should we remove the gym environment at inference time for the real world?
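For users who do want a gym interface, a thin wrapper over the same raw I/O could remain possible. A minimal sketch, assuming the hypothetical `camera` and `dynamixel` handles from the loop above:

```python
import gymnasium as gym

class RobotEnv(gym.Env):
    """Optional, user-defined wrapper around raw robot I/O."""

    def __init__(self, camera, dynamixel, motor_ids=(0, 1, 2, 3, 4, 5, 6)):
        self.camera = camera
        self.dynamixel = dynamixel
        self.motor_ids = list(motor_ids)

    def _get_obs(self):
        return {
            "image": self.camera.read()[1],
            "qpos": self.dynamixel.read(self.motor_ids),
        }

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        return self._get_obs(), {}

    def step(self, action):
        self.dynamixel.write(action, self.motor_ids)
        return self._get_obs(), 0.0, False, False, {}
```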
Thanks @haixuanTao, maybe more a comment for https://github.com/huggingface/lerobot/pull/246?
This PR is really just a deep dive into the algorithmic differences between our implementation of ACT and the original, in particular on the model side. I'll probably close this PR and open a series of smaller ones addressing some of these differences.
[DRAFT WIP] Work in progress: deep dive into the differences between our ACT implementation for real-world data and the implementation from https://github.com/thomwolf/ACT
The goal is to see if we can find room for improvement in our short-data ACT trainings (reducing jitter under the same conditions as the original ACT code).
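As a minimal sketch of how such differences could be localized (the `lerobot_act` / `original_act` handles and the batch keys are assumptions, not code from either repo), one could run the same dummy batch through both implementations and compare outputs:

```python
import torch

torch.manual_seed(0)
batch = {
    "observation.images.top": torch.rand(1, 3, 480, 640),
    "observation.state": torch.rand(1, 14),
}
with torch.no_grad():
    ours = lerobot_act.select_action(batch)     # assumed handle to our ACT
    theirs = original_act.select_action(batch)  # assumed handle to the original
# A large gap here points to a model-side divergence worth bisecting.
print("max abs diff:", (ours - theirs).abs().max().item())
```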
Currently listed differences: