Aleph-Alpha / magma

MAGMA - a GPT-style multimodal model that can understand any combination of images and language. NOTE: The freely available model from this repo is only a demo. For the latest multimodal and multilingual models from Aleph Alpha check out our website
MIT License
469 stars, 55 forks

More model structure details, especially parameter counts of the adapters #49

Open Stephan-XUE opened 4 months ago

Stephan-XUE commented 4 months ago

Hello, and first of all thanks for your work on the multimodal model using adapter-based fine-tuning, which has inspired me greatly. I would like to know more about the model structure, especially the parameter counts of the adapters. In the ablation study you only report these as a ratio relative to the default setting (sequential FF adapters with a downsample factor of 4).
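To make the question concrete, here is a rough way such a count could be estimated. This is only a sketch under stated assumptions, not MAGMA's actual implementation: it assumes each adapter is a plain two-layer bottleneck (a down-projection from the hidden size to hidden size divided by the downsample factor, then an up-projection back) with biases, and no extra layer norm or scaling parameters. The function names, the hidden size of 4096, and the 28 layers are hypothetical values chosen for illustration.

```python
def bottleneck_adapter_params(d_model: int, downsample_factor: int, bias: bool = True) -> int:
    """Estimate the parameter count of one bottleneck adapter (assumed structure)."""
    d_bottleneck = d_model // downsample_factor
    # Down-projection (d_model x d_bottleneck) plus up-projection (d_bottleneck x d_model).
    weights = 2 * d_model * d_bottleneck
    # One bias vector per projection, if biases are used.
    biases = (d_bottleneck + d_model) if bias else 0
    return weights + biases


def total_adapter_params(d_model: int, downsample_factor: int, n_layers: int,
                         adapters_per_layer: int = 1, bias: bool = True) -> int:
    """Estimate the total across all layers, assuming identical adapters in each layer."""
    per_adapter = bottleneck_adapter_params(d_model, downsample_factor, bias)
    return n_layers * adapters_per_layer * per_adapter


if __name__ == "__main__":
    # Hypothetical configuration, for illustration only.
    d_model, n_layers = 4096, 28
    for factor in (4, 8):
        total = total_adapter_params(d_model, factor, n_layers)
        print(f"downsample factor {factor}: {total:,} adapter parameters")
```

Under these assumptions the weight term scales inversely with the downsample factor, so doubling the factor roughly halves the count. The real numbers would still depend on where the adapters are placed and on any additional parameters they carry, which is what the question above is asking about.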