kyegomez / PALM-E

Implementation of "PaLM-E: An Embodied Multimodal Language Model"
https://discord.gg/GYbXvDGevY
Apache License 2.0
272 stars, 45 forks

transfer for caption #15

Open Vincentlu1412 opened 2 months ago

Vincentlu1412 commented 2 months ago

Thanks for your work!! As shown in example.py, the caption is passed in as a tensor. So, do I need to create my own transformer-like model to convert a text caption into a tensor?

kyegomez commented 2 months ago

@Vincentlu1412 No, you don't need a separate model. You need to tokenize the text and then pass the resulting token IDs to the model. Look up Karpathy's video.
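
As a rough illustration of what "tokenize the text" means here, the sketch below maps a caption to integer IDs using a simple byte-level scheme. Everything in it is an assumption for illustration, not code from this repository: `PAD_ID`, `encode_caption`, `decode_caption`, and `to_tensor` are hypothetical names, and the vocabulary and sequence length the model actually expects must be taken from example.py. In practice a trained subword tokenizer would normally replace the byte-level step.

```python
from typing import List

# Hypothetical padding id: one past the largest byte value (0-255).
PAD_ID = 256


def encode_caption(text: str, max_len: int) -> List[int]:
    """Byte-level tokenization: each UTF-8 byte becomes one integer id."""
    ids = list(text.encode("utf-8"))[:max_len]      # truncate long captions
    return ids + [PAD_ID] * (max_len - len(ids))    # right-pad short captions


def decode_caption(ids: List[int]) -> str:
    """Inverse of encode_caption: drop padding and decode the bytes."""
    data = bytes(i for i in ids if i != PAD_ID)
    return data.decode("utf-8", errors="replace")


def to_tensor(ids: List[int]):
    """Wrap the ids in a (1, max_len) integer tensor; requires PyTorch."""
    import torch  # imported lazily so the tokenizer itself has no dependencies
    return torch.tensor([ids], dtype=torch.long)


ids = encode_caption("a cat", max_len=8)
print(ids)                  # [97, 32, 99, 97, 116, 256, 256, 256]
print(decode_caption(ids))  # a cat
```

With PyTorch installed, `to_tensor(ids)` gives a tensor of shape `(1, 8)` that could stand in for the random caption tensor, provided the model's vocabulary size is at least 257 and its expected sequence length matches `max_len`.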

Vincentlu1412 commented 2 months ago

@kyegomez Amazing!! Do you mean a video from this guy, "https://github.com/karpathy/karpathy.github.io"? Is there a URL for it?