Aleph-Alpha / magma

MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multilingual models from Aleph Alpha check out our website https://app.aleph-alpha.com
MIT License
469 stars, 55 forks

More model structure details, especially parameter numbers of the adapters #49

Open Stephan-XUE opened 4 months ago

Stephan-XUE commented 4 months ago

Hello, first, thanks for your work on the multimodal model using adapter-based fine-tuning, which inspires me greatly. I would like to know more details about the model structure, especially the parameter counts of the adapters. The ablation study only reports these as a ratio relative to the default setting (sequential FF adapters with a downsample factor of 4).
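
As a rough illustration of the quantity being asked about, here is a minimal sketch that counts the parameters of a generic bottleneck adapter (a down-projection followed by an up-projection). It assumes two linear layers with biases and a bottleneck width of `hidden_dim // downsample_factor`; the function name, the hidden size of 4096, and the layer count of 28 are hypothetical values chosen for illustration and are not confirmed by this repo, so the result is not MAGMA's actual adapter size.

```python
def bottleneck_adapter_params(hidden_dim: int,
                              downsample_factor: int = 4,
                              bias: bool = True) -> int:
    """Count parameters of a generic two-layer bottleneck adapter.

    Assumed layout (not taken from the MAGMA code):
    Linear(hidden_dim -> bottleneck), nonlinearity, Linear(bottleneck -> hidden_dim).
    """
    bottleneck = hidden_dim // downsample_factor
    # Down-projection: weight matrix plus optional bias vector.
    down = hidden_dim * bottleneck + (bottleneck if bias else 0)
    # Up-projection: weight matrix plus optional bias vector.
    up = bottleneck * hidden_dim + (hidden_dim if bias else 0)
    return down + up


if __name__ == "__main__":
    hidden_dim = 4096   # hypothetical hidden size, for illustration only
    num_layers = 28     # hypothetical number of adapted layers
    per_adapter = bottleneck_adapter_params(hidden_dim, downsample_factor=4)
    print(f"per adapter: {per_adapter:,}")
    print(f"one adapter per layer: {per_adapter * num_layers:,}")
```

Under these assumptions the count scales roughly as `2 * hidden_dim**2 / downsample_factor` per adapter, which is why a ratio to the default setting is enough to compare ablations but not to recover absolute numbers without knowing the hidden size and the number of adapted layers.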