yxuansu / PandaGPT

[TLLM'23] PandaGPT: One Model To Instruction-Follow Them All
https://panda-gpt.github.io/
Apache License 2.0

Training Stage #1

Open gordonhu608 opened 1 year ago

gordonhu608 commented 1 year ago

Thank you for your excellent work. Since the training stage isn't described, does that mean the 160K data is used to train the LLM with LoRA and the llama projection layer at the same time? If that is the case, should I expect something like `save_to_modules: [llama projection]`, or do you simply set the projection layer's parameters to require gradients?

yxuansu commented 1 year ago

Hi @gordonhu608, thanks for your interest in our work! Yes, we jointly train the LoRA weights and the projection layer during training. To do so, we simply allow the gradients to flow back to the parameters of the projection layer. Please see the related code here.
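For anyone landing here later, this is roughly what the setup looks like in PyTorch with PEFT. This is a minimal sketch, not the exact code in this repo: the checkpoint path, the 1024-dimensional input (ImageBind's joint embedding size), the `target_modules` list, and the hyperparameters are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base LLM and wrap it with LoRA adapters; PEFT freezes the
# base weights and leaves only the LoRA matrices trainable.
llm = AutoModelForCausalLM.from_pretrained("path/to/llama-7b")  # placeholder path
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # illustrative
    task_type="CAUSAL_LM",
)
llm = get_peft_model(llm, lora_config)

# Projection layer mapping multimodal embeddings (assumed 1024-d, as in
# ImageBind) into the LLM hidden space. A plain nn.Module's parameters
# require gradients by default, so no special save/register mechanism
# is needed -- just include them in the optimizer.
llama_proj = nn.Linear(1024, llm.config.hidden_size)
for p in llama_proj.parameters():
    p.requires_grad = True  # explicit, though this is already the default

# One optimizer over the LoRA weights and the projection layer trains
# both jointly in a single stage.
trainable = [
    p
    for p in list(llm.parameters()) + list(llama_proj.parameters())
    if p.requires_grad
]
optimizer = torch.optim.AdamW(trainable, lr=5e-4)
```

In PEFT terms, if the projection layer were a submodule of the wrapped model, `modules_to_save` in `LoraConfig` could achieve a similar effect; keeping it as a separate trainable module whose state dict is saved alongside the LoRA weights works just as well.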