MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multilingual models from Aleph Alpha check out our website https://app.aleph-alpha.com
MIT License
More model structure details, especially parameter numbers of the adapters #49
Hello! First, thanks for your work on the multimodal model using adapter-based fine-tuning; it has inspired me greatly.
I would like to know more details about the model structure, especially the parameter counts of the adapters. The ablation study only reports them as a ratio relative to the default setting (sequential FF adapters with downsample factor 4).