octo-models / octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
https://octo-models.github.io/
MIT License

`absolute_action_mask` removed #100

Open andrearosasco opened 5 months ago

andrearosasco commented 5 months ago

Hello, I noticed that the new commits removed the absolute_action_mask argument. How is action padding being managed now?

Thanks

zwbx commented 5 months ago

Hey, I encountered the same problem. Have you successfully run any scripts from the newly committed version?

Anatr1 commented 5 months ago

I was wondering the same. Any update?

raffaello-camoriano commented 5 months ago

We are also encountering blocking issues with this. Any additional detail regarding action padding would be really helpful. Thank you.

HomerW commented 5 months ago

Hi, sorry for the late reply! Previously, the absolute_action_mask was used to decide what to do with actions in a chunk that go past the point where the goal was achieved (either zeroing them out or duplicating the last action, so that the policy is trained to remain at the goal). Now, instead of using absolute_action_mask and trying to create neutral actions, the dataloader simply indicates when the goal (or the end of the trajectory, if there are no goals) has been reached using the key task_completed. It also updates action_pad_mask to indicate that any actions past the end of the goal should be considered padding.
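
For anyone trying to picture the new behaviour, here is a minimal pure-Python sketch of the idea, not the actual Octo dataloader code. The helpers `chunk_masks` and `masked_mse` and the parameter `goal_timestep` are hypothetical names made up for illustration. It also assumes that `True` in action_pad_mask marks a real action and `False` marks padding, which this thread does not confirm.

```python
def chunk_masks(start, chunk_size, goal_timestep):
    """Return (task_completed, action_pad_mask) for one action chunk.

    Hypothetical helper. `goal_timestep` is assumed to be the first
    timestep at which the goal (or the end of the trajectory) is reached.
    Assumed convention: True in action_pad_mask means "real action".
    """
    timesteps = range(start, start + chunk_size)
    task_completed = [t >= goal_timestep for t in timesteps]
    # Actions at or after the goal are treated as padding.
    action_pad_mask = [not done for done in task_completed]
    return task_completed, action_pad_mask


def masked_mse(pred, target, action_pad_mask):
    """Mean squared error computed over non-padding actions only."""
    total, count = 0.0, 0
    for p_vec, t_vec, keep in zip(pred, target, action_pad_mask):
        if not keep:
            continue  # padded action: contributes nothing to the loss
        for p, t in zip(p_vec, t_vec):
            total += (p - t) ** 2
            count += 1
    return total / count if count else 0.0


# A chunk of 4 actions starting at timestep 2, with the goal reached at 4.
task_completed, action_pad_mask = chunk_masks(start=2, chunk_size=4, goal_timestep=4)

pred = [[1.0], [2.0], [9.0], [9.0]]
target = [[1.0], [4.0], [0.0], [0.0]]
loss = masked_mse(pred, target, action_pad_mask)
```

In this sketch the last two actions in the chunk fall past the goal, so they are ignored by the loss rather than being replaced with zeroed or duplicated "neutral" actions.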