ylsung / vl-merging

PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging"

Question about model merging #1

Closed by ZihaoZheng98 1 year ago

ZihaoZheng98 commented 1 year ago

Hi, model merging is a new term for me. As I understand it, model merging is a technique that achieves a multi-task objective without requiring multi-task training. Is that correct?

ZihaoZheng98 commented 1 year ago

What if the two models have different input/output formats? Does this technique still work?

ylsung commented 1 year ago

Regarding the first question: yes, but the technique only works this way when you start from a strong pre-trained model.

Regarding the second question: in my work, I merge vision and language architectures. They have different embedding layers, so I don't merge those. I only merge the transformer layers, because their configuration is the same. I found that in this case fine-tuning is needed after merging; otherwise the performance is bad.
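To illustrate the layer selection described above, here is a minimal sketch (not code from this repository). It assumes simple linear interpolation as the merging rule, a hypothetical `transformer.` name prefix for the shared layers, and plain Python lists standing in for tensors so it runs without dependencies. The function name `merge_shared_layers` and the `vision.` / `language.` key prefixes are also made up for the example.

```python
def merge_shared_layers(vision_sd, language_sd, alpha=0.5, shared_prefix="transformer."):
    """Interpolate parameters that both models share; keep the rest separate.

    vision_sd / language_sd: dicts mapping parameter names to lists of floats
    (stand-ins for state-dict tensors). alpha weights the vision model.
    """
    merged = {}
    for name, v_param in vision_sd.items():
        if name.startswith(shared_prefix) and name in language_sd:
            l_param = language_sd[name]
            if len(v_param) != len(l_param):
                raise ValueError(f"shape mismatch for {name}")
            # Shared transformer layer: element-wise linear interpolation.
            merged[name] = [alpha * v + (1 - alpha) * l
                            for v, l in zip(v_param, l_param)]
        else:
            # Modality-specific layer (e.g. embeddings): copied, not merged.
            merged["vision." + name] = list(v_param)
    for name, l_param in language_sd.items():
        if not (name.startswith(shared_prefix) and name in vision_sd):
            merged["language." + name] = list(l_param)
    return merged


# Toy example: embeddings differ in name and size, the transformer layer matches.
vision_sd = {
    "patch_embed.weight": [1.0, 2.0],
    "transformer.layer0.weight": [0.0, 4.0],
}
language_sd = {
    "word_embed.weight": [5.0, 6.0, 7.0],
    "transformer.layer0.weight": [2.0, 0.0],
}
merged = merge_shared_layers(vision_sd, language_sd, alpha=0.5)
```

With PyTorch models, the same loop could run over `state_dict()` entries instead of lists. The fine-tuning step mentioned above is not shown here.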

ZihaoZheng98 commented 1 year ago

Thanks. This work gave me new insights.

ylsung commented 1 year ago

No problem!