Trainable parameters during finetuning for medical VQA

microsoft / LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

Open DopamineLcy opened 3 weeks ago

DopamineLcy commented 3 weeks ago

Thank you for your impressive work!

Could you clarify which of the image encoder, the projection layer, and the LM are trainable during finetuning for medical VQA?
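For context, this is roughly how I would check it myself on a loaded checkpoint. It is only a minimal sketch: it assumes the model exposes PyTorch-style `named_parameters()`, and the name substrings `vision_tower` and `mm_projector` are my guesses (hypothetical), not confirmed module names in this repo. Anything that matches neither is counted under `other`, which I would expect to be mostly the LM.

```python
def summarize_trainable(named_params, groups, default="other"):
    """Count (trainable, total) parameter elements per component.

    named_params: iterable of (name, param) pairs, e.g. model.named_parameters().
                  Each param only needs .numel() and .requires_grad.
    groups:       dict mapping a label to a substring of the parameter name.
    default:      label used for parameters that match no group.
    """
    stats = {label: [0, 0] for label in groups}
    stats[default] = [0, 0]
    for name, param in named_params:
        # First group whose substring occurs in the parameter name wins.
        label = next((g for g, key in groups.items() if key in name), default)
        count = param.numel()
        stats[label][1] += count          # total elements
        if param.requires_grad:
            stats[label][0] += count      # trainable elements
    return {label: tuple(pair) for label, pair in stats.items()}


# Hypothetical substrings; the real module names may differ.
GROUPS = {
    "image_encoder": "vision_tower",
    "projection": "mm_projector",
}

# Usage with a loaded model (assumed to be a torch.nn.Module, not run here):
# for label, (trainable, total) in summarize_trainable(
#         model.named_parameters(), GROUPS).items():
#     print(f"{label}: {trainable} of {total} elements trainable")
```

A component whose trainable count is zero would be frozen under this check, but that only reflects the `requires_grad` flags at the moment of inspection, so the intended finetuning setup still needs confirmation.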

Looking forward to your reply.

Best,