Experiment with various Vision Transformers (ViT)

ManifoldRG / NEKO

In-progress implementation of a GATO-style generalist multimodal model capable of image, text, RL, and robotics tasks

https://discord.gg/brsPnzNd8h

GNU General Public License v3.0


Open: henryj18 opened this issue 4 months ago

henryj18 commented 4 months ago

Experiment with the ViT implementations in https://github.com/lucidrains/vit-pytorch: replace the current ImageEmbedding with one of these ViTs and check whether it improves NEKO's performance.
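To make the proposal concrete, here is a rough sketch of what a ViT-style image embedding could look like. It is written in plain PyTorch rather than with vit-pytorch itself, so it is only a stand-in for the models in that repository. The class name `ViTImageEmbedding`, its constructor arguments, and the assumed interface (a batch of images in, a sequence of patch tokens out) are all hypothetical and not taken from NEKO's actual ImageEmbedding.

```python
import torch
from torch import nn


class ViTImageEmbedding(nn.Module):
    """Hypothetical stand-in: maps images to a sequence of patch tokens."""

    def __init__(self, image_size=64, patch_size=16, channels=3,
                 dim=128, depth=2, heads=4, mlp_dim=256):
        super().__init__()
        assert image_size % patch_size == 0, "image must divide evenly into patches"
        self.num_patches = (image_size // patch_size) ** 2
        # A strided convolution cuts the image into patches and projects each to `dim`.
        self.to_patches = nn.Conv2d(channels, dim,
                                    kernel_size=patch_size, stride=patch_size)
        # Learned positional embedding, one vector per patch.
        self.pos_embedding = nn.Parameter(torch.zeros(1, self.num_patches, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=mlp_dim,
            batch_first=True, norm_first=True,
        )
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=depth, enable_nested_tensor=False
        )

    def forward(self, images):
        # images: (batch, channels, height, width)
        x = self.to_patches(images)        # (batch, dim, H/patch, W/patch)
        x = x.flatten(2).transpose(1, 2)   # (batch, num_patches, dim)
        x = x + self.pos_embedding
        return self.encoder(x)             # (batch, num_patches, dim)


embed = ViTImageEmbedding()
tokens = embed(torch.randn(2, 3, 64, 64))
print(tokens.shape)
```

With these example sizes, each 64x64 image becomes 16 patch tokens of width 128. Whether that token count and width match what the rest of NEKO expects from its image embedding is an open question for the experiment.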