ManifoldRG / NEKO

In-progress implementation of a GATO-style generalist multimodal model capable of image, text, RL, and robotics tasks
https://discord.gg/brsPnzNd8h
GNU General Public License v3.0
38 stars 9 forks

Test the codebase with timm vit #91

Open henryj18 opened 1 month ago

henryj18 commented 1 month ago

In an effort to decrease the training loss on the VQA task, we are experimenting with a different ViT: the timm ViT, which is pre-trained. So far, testing shows that the VQA loss is not improving with this ViT, so we need to explore further.