mbodiai / embodied-agents

Seamlessly integrate state-of-the-art transformer models into robotics stacks
https://mbodi.ai/
Apache License 2.0

Rt1-VIT-bert #14

Open Tilak1114 opened 3 weeks ago

Tilak1114 commented 3 weeks ago

Replaced EfficientNet with ViT and integrated a BERT text encoder for rt1

Description

This update replaces the existing EfficientNet model in rt1 with a Vision Transformer (ViT) and adds a BERT text encoder. It includes a dummy sample for testing that the network runs. Training code is not part of this update.
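To make the encoder swap concrete, the sketch below tallies the token shapes each encoder would produce for one dummy sample. It is not code from this PR. The image size (224), patch size (16), hidden width (768), and instruction length (12 tokens, special tokens included) are assumed values that match the common ViT-Base/16 and BERT-base configurations, and `vit_token_count`, `EncoderShapes`, and `dummy_sample_shapes` are hypothetical names.

```python
from dataclasses import dataclass


def vit_token_count(image_size: int, patch_size: int, cls_token: bool = True) -> int:
    """Number of tokens a ViT emits for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + (1 if cls_token else 0)


@dataclass(frozen=True)
class EncoderShapes:
    image_tokens: tuple  # (num_tokens, hidden) from the vision encoder
    text_tokens: tuple   # (num_tokens, hidden) from the text encoder


def dummy_sample_shapes(image_size=224, patch_size=16, text_len=12, hidden=768):
    # All defaults are assumptions (ViT-Base/16 and BERT-base sizes),
    # not values taken from the PR.
    return EncoderShapes(
        image_tokens=(vit_token_count(image_size, patch_size), hidden),
        text_tokens=(text_len, hidden),
    )


shapes = dummy_sample_shapes()
print(shapes.image_tokens)  # (197, 768): 14 * 14 patches plus one [CLS] token
print(shapes.text_tokens)   # (12, 768)
```

Under these assumptions both encoders emit 768-wide tokens, so their outputs could be combined without an extra projection. Whether rt1 combines them this way is not stated in the description.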

Summary:

How Has This Been Tested?

The changes were verified by running a dummy sample through the network and confirming that it works as expected.
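A dummy-sample check of this kind can be sketched as follows. This is only an illustration of the pattern, not the test in this PR: `stand_in_network` is a hypothetical placeholder for the real model, and the 7-value output, the flattened 3x8x8 image, and the 12 token ids are all assumed sizes.

```python
import math
import random


def smoke_check(network, sample, expected_len):
    """Run one sample through `network` and sanity-check the output."""
    output = network(sample)
    assert len(output) == expected_len, f"expected {expected_len} values, got {len(output)}"
    assert all(math.isfinite(v) for v in output), "output contains NaN or inf"
    return output


def stand_in_network(sample):
    # Hypothetical stand-in for the real model: reduces the image values and
    # token ids to two scalars, then emits a fixed-size vector in [-1, 1].
    image_mean = sum(sample["image"]) / len(sample["image"])
    text_mean = sum(sample["token_ids"]) / len(sample["token_ids"])
    return [math.tanh(image_mean + 0.001 * text_mean * i) for i in range(7)]


rng = random.Random(0)  # fixed seed so the dummy sample is reproducible
dummy = {
    "image": [rng.random() for _ in range(3 * 8 * 8)],       # flattened fake image
    "token_ids": [rng.randrange(1000) for _ in range(12)],   # fake token ids
}
actions = smoke_check(stand_in_network, dummy, expected_len=7)
```

A check like this only shows that the network runs and returns finite values of the expected size. It says nothing about output quality, which is consistent with training being out of scope here.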

Testing:

Checklist