octo-models / octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
https://octo-models.github.io/
MIT License

Aloha Finetuning Configuration #42

Open andrearosasco opened 5 months ago

andrearosasco commented 5 months ago

Do you have any insights about the differences in configuration between the pre-trained model and the aloha fine-tuned one?

[Screenshot: comparison of the pre-trained and ALOHA finetuning configurations]

In particular, I was wondering what motivated these choices.
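(For anyone wanting to make the same comparison locally: here is a minimal sketch of inspecting the config stored with the released checkpoint. It assumes the `OctoModel.load_pretrained` API and the `hf://rail-berkeley/octo-small` checkpoint path from the Octo README; the exact config keys are an assumption and may differ between releases.)

```python
from octo.model.octo_model import OctoModel

# Load the released pre-trained checkpoint (path taken from the Octo README).
model = OctoModel.load_pretrained("hf://rail-berkeley/octo-small")

# The training config ships alongside the checkpoint; the action-head
# settings discussed in this thread should live under model/heads.
# (Key names are an assumption and may vary across Octo versions.)
print(model.config["model"]["heads"]["action"])
```

The finetuning scripts build their own config, so printing both and diffing them is one way to see exactly which fields were changed for ALOHA.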

seann999 commented 5 months ago

I'm also curious why multi-head attention is enabled in the `L1Head` when it seems to be `False` for the `DiffusionHead`. Is this also based on the design decisions in the ALOHA/ACT paper?

kpertsch commented 1 month ago

Ah, sorry, I had missed this question when you posted it a while back!

Re multi-head attention: I am assuming your question is regarding the `use_map` argument? Since we're using a single readout token for the action head in both cases, the attention pooling shouldn't have much effect, so I'd expect this argument not to matter in practice.
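(For intuition, here is a minimal single-head sketch of attention pooling over readout tokens, the mechanism `use_map` toggles. All names here, `map_pool`, `probe`, the weight matrices, are illustrative rather than Octo's actual implementation. With a single readout token the softmax is taken over one logit and collapses to a weight of 1, so the pooled output is just a fixed linear projection of that token, which is why the flag shouldn't matter much in this setting.)

```python
import jax
import jax.numpy as jnp

def map_pool(tokens, probe, w_q, w_k, w_v):
    """Simplified single-head attention pooling: a learned probe vector
    attends over `tokens` of shape (num_tokens, dim)."""
    q = probe @ w_q                         # (dim,)
    k = tokens @ w_k                        # (num_tokens, dim)
    v = tokens @ w_v                        # (num_tokens, dim)
    logits = k @ q / jnp.sqrt(q.shape[-1])  # (num_tokens,)
    attn = jax.nn.softmax(logits)           # attention weights over tokens
    return attn @ v                         # (dim,)

dim = 8
keys = jax.random.split(jax.random.PRNGKey(0), 5)
probe = jax.random.normal(keys[0], (dim,))
w_q, w_k, w_v = (jax.random.normal(k, (dim, dim)) for k in keys[1:4])
single_token = jax.random.normal(keys[4], (1, dim))  # one readout token

# With one token, softmax over a single logit is exactly 1.0, so pooling
# degenerates into a linear projection of that token.
pooled = map_pool(single_token, probe, w_q, w_k, w_v)
print(jnp.allclose(pooled, (single_token @ w_v)[0], atol=1e-6))  # True
```

With multiple readout tokens the attention weights would actually vary per token and the flag could change behavior; with a single token it reduces to the projection above.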