Open · Andy1621 opened this issue 4 months ago
Hi! Thanks for your interesting work!

I just found that when using LoRA, the `input_embeddings` and `out_embeddings` are not updated because of the following code: https://github.com/eric-ai-lab/MiniGPT-5/blob/2121c745b2cb2d7e842e03b4bcaa89c63f2ee6c1/minigpt4/models/mini_gpt5.py#L115-L116

Considering that LoRA is used in Stage 2, does this mean the `input_embeddings` and `out_embeddings` are only updated in Stage 1? If so, these two lines are redundant, since PEFT will already set them to not be updated.

Reply:

Yes, they should be updated. That code is for an older PEFT version, in which PEFT keeps two copies of the input embeddings (a copied one and the original one), and one of them (the original one) does not participate in the gradient calculation. If both are set to require gradients, DDP will find unused parameters. These two lines are there to avoid that situation (if I remember correctly).
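For readers who run into the same DDP error, below is a minimal, hypothetical sketch of the situation the reply describes. It is not the MiniGPT-5 code itself: the checkpoint name, LoRA hyperparameters, and `target_modules` are placeholders, and the two `requires_grad` lines are only an assumption about what the referenced #L115-L116 roughly do under an older PEFT version, where `modules_to_save` wraps each listed module so that it holds both a frozen `original_module` and a trainable copy.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder checkpoint for illustration; MiniGPT-5 uses a Vicuna/LLaMA model instead.
base = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-LlamaForCausalLM")

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],          # LoRA adapters on the attention projections
    modules_to_save=["embed_tokens", "lm_head"],  # train full copies of the embeddings / output head
)
model = get_peft_model(base, lora_config)

# With `modules_to_save`, get_input_embeddings() / get_output_embeddings() now return PEFT
# wrappers that hold `original_module` (the old weights, unused in the forward pass) and a
# trainable copy that actually receives gradients. If the unused original copy also has
# requires_grad=True, DDP complains about parameters that never receive gradients.
# The two lines discussed in this issue (roughly) do the following to avoid that:
model.get_input_embeddings().original_module.weight.requires_grad = False
model.get_output_embeddings().original_module.weight.requires_grad = False

# Sanity check: the copied embedding / lm_head weights stay trainable, so they are still
# updated in Stage 2; only the duplicated originals are frozen.
for name, param in model.named_parameters():
    if param.requires_grad and ("embed_tokens" in name or "lm_head" in name):
        print(name, tuple(param.shape))
```

If the original copies were instead left with `requires_grad=True`, `torch.nn.parallel.DistributedDataParallel` would stop with its usual "parameters that were not used in producing loss" error unless `find_unused_parameters=True` is enabled, which is presumably why the freeze was added.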