Yangyi-Chen / SOLO

Public code repo for paper "A Single Transformer for Scalable Vision-Language Modeling"
Apache License 2.0

Training Strategy #5

Closed: syspider closed this issue 3 months ago

syspider commented 3 months ago

Thank you for the remarkable work!

In all three training stages, are all of the model's parameters updated? Or, in stage 1, is only the linear projection updated?

Looking forward to your reply.

Yangyi-Chen commented 3 months ago

Hi! Thanks for your interest in our work. Yes, all parameters are fine-tuned in the three stages of pre-training and in instruction fine-tuning.
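
For readers comparing the two options raised in the question, here is a minimal PyTorch sketch of the difference between updating every parameter (what the answer above describes) and updating only a linear projection. It is not code from the SOLO repository: `ToySingleTransformer`, `patch_proj`, `backbone`, `set_trainable`, and all sizes are hypothetical names and values chosen for illustration.

```python
from torch import nn


class ToySingleTransformer(nn.Module):
    """Hypothetical stand-in: a linear patch projection feeding a small transformer."""

    def __init__(self, patch_dim: int = 48, hidden: int = 32):
        super().__init__()
        # Linear projection that maps flattened image patches to the hidden size.
        self.patch_proj = nn.Linear(patch_dim, hidden)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=4, dim_feedforward=64, batch_first=True
        )
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, patches):
        return self.backbone(self.patch_proj(patches))


def set_trainable(model: nn.Module, projection_only: bool) -> None:
    """Mark either all parameters or only the projection as trainable."""
    for name, param in model.named_parameters():
        is_projection = name.startswith("patch_proj.")
        param.requires_grad_(is_projection or not projection_only)


def count_trainable(model: nn.Module) -> int:
    """Number of scalar parameters that will receive gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


model = ToySingleTransformer()
total = sum(p.numel() for p in model.parameters())

# Option asked about in the question: only the linear projection is updated.
set_trainable(model, projection_only=True)
proj_only = count_trainable(model)

# Option described in the answer: every parameter is updated.
set_trainable(model, projection_only=False)
full = count_trainable(model)

print(f"projection only: {proj_only} of {total} parameters trainable")
print(f"full fine-tuning: {full} of {total} parameters trainable")
```

In the full-parameter setting, `model.parameters()` can be passed to the optimizer directly. In the projection-only setting, one would typically pass only the parameters whose `requires_grad` is `True`.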