Why use the output of the first decoder layer in the ACT model?

MarkFzp / act-plus-plus

Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN

https://mobile-aloha.github.io/

MIT License


Why use the output of the first decoder layer in the ACT model? #26

Open junhui1997 opened 8 months ago

junhui1997 commented 8 months ago

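In the ACT model, the transformer output is indexed with [0]:

hs = self.transformer(src, None, self.query_embed.weight, pos, latent_input, proprio_input, self.additional_pos_embed.weight)[0]

Should this index be -1 instead, so that the output of the last decoder layer is used rather than the first?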
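To make the question concrete, here is a minimal toy sketch of the difference between the two indices. It assumes the transformer returns the output of every decoder layer stacked along the first axis, as DETR-style decoders commonly do when intermediate outputs are returned; whether that holds for this repository is an assumption, and `toy_layer` and `toy_decoder` are hypothetical stand-ins, not code from act-plus-plus.

```python
# Toy stand-in for a decoder that returns every layer's output.
# All names are hypothetical; this is not the act-plus-plus implementation.

def toy_layer(x, layer_idx):
    # Each "layer" adds (layer_idx + 1) so per-layer outputs are distinguishable.
    return [v + layer_idx + 1 for v in x]

def toy_decoder(x, num_layers, return_intermediate=True):
    intermediate = []
    for i in range(num_layers):
        x = toy_layer(x, i)
        intermediate.append(x)
    # With return_intermediate, the leading axis indexes decoder layers;
    # otherwise only the final output is returned, wrapped in a length-1 list.
    return intermediate if return_intermediate else [x]

stacked = toy_decoder([0.0, 0.0], num_layers=3)
first_layer_out = stacked[0]    # output after the first layer only
last_layer_out = stacked[-1]    # output after all layers

print(first_layer_out, last_layer_out)
```

Under this assumption, [0] and [-1] select different tensors whenever there is more than one decoder layer. If only the final output is returned, the leading axis has length 1 and the two indices pick the same element.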