ManifoldRG / NEKO

In-progress implementation of a GATO-style generalist multimodal model capable of image, text, RL, and robotics tasks
https://discord.gg/brsPnzNd8h
GNU General Public License v3.0

Investigate Potential Use of Pretrained Models in Generalist / Massively Multimodal Setting #47

Open · harshsikka opened 1 year ago

harshsikka commented 1 year ago

Rather than training from scratch, we might use pretrained weights to serve as a basis for our model.

We want to understand which models might serve as the basis for our multimodal model:

• A first place to start might be other multimodal papers.
• Beyond this, the LLM literature could also be useful.

Outcome: a writeup/analysis of the above.
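As a rough illustration of the "pretrained weights as a basis" idea, here is a minimal sketch of wrapping a pretrained LLM backbone with extra modality heads. This is not NEKO's actual code; the `GeneralistModel` wrapper, the `gpt2` backbone choice, and the observation dimension are all illustrative assumptions.

```python
# A minimal sketch (not NEKO's actual architecture) of initializing a
# generalist model from pretrained LLM weights instead of from scratch.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

class GeneralistModel(nn.Module):
    """Hypothetical wrapper: a pretrained LLM backbone plus an extra
    projection for non-text modalities (e.g. RL observations)."""

    def __init__(self, backbone_name: str = "gpt2"):
        super().__init__()
        # Reuse pretrained transformer weights as the sequence model.
        self.backbone = AutoModelForCausalLM.from_pretrained(backbone_name)
        hidden = self.backbone.config.hidden_size
        # Illustrative projection mapping continuous observations
        # (assumed 64-dim here) into the backbone's embedding space.
        self.obs_proj = nn.Linear(64, hidden)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.backbone(input_ids=input_ids).logits

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = GeneralistModel("gpt2")
out = model(tokenizer("hello world", return_tensors="pt").input_ids)
print(out.shape)  # (batch, seq_len, vocab_size)
```

The point of the sketch is that the expensive sequence-modeling capacity comes for free from the pretrained checkpoint; only the modality-specific adapters would need to be trained from scratch.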

BobakBagheri commented 10 months ago

Update: We want to leverage massive LLMs, the current state of the art, and fine-tune them for our purpose. This issue needs an update from Harsh and has been moved from backlog to in-progress (assigning the task to Harsh).
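For concreteness, a fine-tuning pass over such a pretrained LLM could look like the sketch below. The dataset, batch shapes, and hyperparameters are placeholders, not project decisions; the random token sequences stand in for tokenized multimodal episodes.

```python
# Minimal fine-tuning sketch, assuming we adapt a pretrained causal LM
# to our own token streams. All data and hyperparameters are dummies.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
# Dummy token sequences standing in for tokenized multimodal episodes.
fake_tokens = torch.randint(0, model.config.vocab_size, (32, 128))
loader = DataLoader(TensorDataset(fake_tokens), batch_size=8, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for (batch,) in loader:
    # Standard next-token objective; passing labels == inputs makes the
    # model compute the shifted cross-entropy loss internally.
    loss = model(input_ids=batch, labels=batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"loss: {loss.item():.3f}")
```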

@harshsikka, please provide an estimate of the time needed to complete the writeup described above.

harshsikka commented 9 months ago

~1 week, I think. It is really meant to be an investigation of state-of-the-art pretrained models we might use as a basis for generalist fine-tuning going forward.