yxuansu / PandaGPT

[TLLM'23] PandaGPT: One Model To Instruction-Follow Them All
https://panda-gpt.github.io/
Apache License 2.0

Inquiry about using LoRA #2

Closed: pipilurj closed this issue 1 year ago

pipilurj commented 1 year ago

Dear authors, hello! This is very interesting work! Just out of curiosity, have you tried training only a linear layer to align the modalities, as in MiniGPT-4 and DetGPT, rather than using LoRA? Does it still work? I suppose alignment may be difficult that way, since there isn't an off-the-shelf Q-Former that can be used with ImageBind features. Thank you very much!

yxuansu commented 1 year ago

Hi @pipilurj, thank you for your question. We also tried training only a linear layer to align the features, as MiniGPT-4 does, and the model is still able to produce reasonable results. So we assume the off-the-shelf Q-Former may not be that important. If you have limited computation resources, I suggest training just a new linear layer and keeping the other parameters frozen.
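
A rough sketch of what this looks like (the dimensions and names here are just for illustration, not our exact configuration): a single trainable projection maps ImageBind features into the LLM's embedding space, while the ImageBind encoder and the LLM stay frozen.

```python
import torch
import torch.nn as nn

class LinearAligner(nn.Module):
    """Minimal sketch: one trainable linear layer projecting ImageBind
    embeddings into the LLM embedding space. Dimensions are illustrative
    assumptions, not the actual PandaGPT config."""
    def __init__(self, imagebind_dim: int = 1024, llm_hidden_dim: int = 5120):
        super().__init__()
        self.proj = nn.Linear(imagebind_dim, llm_hidden_dim)

    def forward(self, imagebind_features: torch.Tensor) -> torch.Tensor:
        # (batch, imagebind_dim) -> (batch, llm_hidden_dim)
        return self.proj(imagebind_features)

def freeze_all_but_aligner(imagebind_encoder: nn.Module,
                           llm: nn.Module,
                           aligner: LinearAligner) -> None:
    """Keep the encoder and the LLM frozen; only the projection is trained."""
    for p in imagebind_encoder.parameters():
        p.requires_grad = False
    for p in llm.parameters():
        p.requires_grad = False
    for p in aligner.parameters():
        p.requires_grad = True
```

With this setup the optimizer only needs to receive `aligner.parameters()`, which keeps memory usage low compared to LoRA fine-tuning of the LLM.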

pipilurj commented 1 year ago

That's great, thanks!