kyegomez / RT-2

Democratization of RT-2 "RT-2: New model translates vision and language into action"
https://discord.gg/qUtxnK2NMf
MIT License
367 stars 50 forks

What tokenizer and embedding do I need? #22

Open andylucny opened 6 months ago

andylucny commented 6 months ago

How can I transform a textual caption into the input of this model?
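
A minimal sketch of one way to approach this, assuming the model expects a fixed-length sequence of integer token IDs (an assumption, not confirmed in this thread). The function `caption_to_ids`, the hashing scheme, and the `vocab_size`, `max_len`, and `pad_id` values are all hypothetical and are not part of this repository; a real setup would use whichever tokenizer and vocabulary the model was trained with.

```python
import zlib


def caption_to_ids(caption, vocab_size=20000, max_len=16, pad_id=0):
    """Map a caption to a fixed-length list of integer token IDs.

    Hypothetical helper: words are hashed into the range [1, vocab_size - 1],
    and ID 0 is reserved for padding. This is a stand-in for a trained tokenizer.
    """
    words = caption.lower().split()
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    ids = [1 + zlib.crc32(w.encode("utf-8")) % (vocab_size - 1) for w in words]
    ids = ids[:max_len]  # truncate captions that are too long
    return ids + [pad_id] * (max_len - len(ids))  # right-pad short captions


token_ids = caption_to_ids("Pick up the red block", max_len=8)
print(token_ids)
```

The resulting list could then be wrapped in a batch dimension and converted to an integer tensor before being passed to the model alongside the image, provided the vocabulary size and sequence length match what the model was built with.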
