In-progress implementation of a GATO-style generalist multimodal model capable of image, text, RL, and robotics tasks
GNU General Public License v3.0
43 stars, 10 forks
Experiment with various Vision Transformers (ViT) #88
Open
henryj18 opened 4 months ago
Experiment with the ViT implementations in https://github.com/lucidrains/vit-pytorch: replace the current ImageEmbedding with one of them and check whether it improves NEKO's performance.
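One thing worth checking before swapping in a ViT is how many tokens each image would add to the sequence, since a ViT splits the image into non-overlapping square patches and emits one embedding per patch. The sketch below is a rough planning aid in plain Python, not NEKO code: `ViTCandidate`, `token_budget`, and all the numbers (224-pixel images, 4 images per sequence, a 1024-token context) are hypothetical placeholders, and the interface of the existing ImageEmbedding is not shown in this issue, so nothing here assumes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTCandidate:
    """Hypothetical description of one ViT configuration to try."""
    name: str
    image_size: int  # square input resolution, in pixels
    patch_size: int  # side length of each square patch, in pixels
    dim: int         # width of each patch embedding

    def num_patches(self) -> int:
        """Number of patch tokens produced for one image."""
        if self.image_size % self.patch_size != 0:
            raise ValueError("image_size must be divisible by patch_size")
        per_side = self.image_size // self.patch_size
        return per_side * per_side


def token_budget(candidate: ViTCandidate,
                 images_per_sequence: int,
                 context_length: int) -> int:
    """Context slots left for non-image tokens after the image patches."""
    used = candidate.num_patches() * images_per_sequence
    return context_length - used


if __name__ == "__main__":
    # Placeholder settings; substitute the values NEKO actually uses.
    candidates = [
        ViTCandidate("patch16", image_size=224, patch_size=16, dim=768),
        ViTCandidate("patch32", image_size=224, patch_size=32, dim=768),
    ]
    for c in candidates:
        print(c.name, c.num_patches(), token_budget(c, 4, 1024))
```

Under these placeholder settings, halving the patch resolution (patch size 16 to 32) cuts the image tokens per frame by a factor of four, so any comparison of ViT variants should probably report sequence length alongside task performance.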