csuhan / OneLLM

[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language

freezing of LLM during pretrain stage #19

Open eugenelet opened 5 months ago

eugenelet commented 5 months ago

Hi, thanks for the awesome contribution to the community!

There's something that has been bugging me for hours. The paper mentions that the LLM is frozen while the projection modules are trained, but I couldn't pinpoint the code responsible for that in the released code. Is the paper just a guideline, has the relevant part not been released yet, or have I simply missed the code responsible for this behavior?
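(For reference, a quick way to check this empirically is to list which top-level submodules still contain trainable parameters after the model is built. This is only a minimal sketch; `model` here stands for the constructed OneLLM model and is not an identifier from the repo.)

# Hypothetical check: `model` is the constructed OneLLM model (placeholder name).
# If the LLM were frozen, 'layers', 'tok_embeddings', 'norm', and 'output'
# would not appear among the trainable submodules.
trainable = {name.split('.')[0] for name, p in model.named_parameters() if p.requires_grad}
print(sorted(trainable))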

Eugene

csuhan commented 5 months ago

@eugenelet Thanks for pointing that out! The current code was adapted from the inference code, so we MISSED some functions such as parameter freezing. We will fix it soon. For now, you can temporarily add a few lines to freeze the parameters in https://github.com/csuhan/OneLLM/blob/db0233348217e0a36d336de31c70adfbf9893a29/model/LLM/onellm.py#L332

For example, to freeze the LLM:

# Freeze the LLM: transformer layers, token embeddings, final norm, and output projection.
for param in (list(self.layers.parameters()) + list(self.tok_embeddings.parameters())
              + list(self.norm.parameters()) + list(self.output.parameters())):
    param.requires_grad = False
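
To confirm the freeze took effect, here is a minimal follow-up sketch (not from the released code; `model` and the learning rate are placeholders) that counts trainable vs. frozen parameters and hands only the trainable ones to the optimizer:

import torch

# `model` is the constructed OneLLM model after applying the freeze above (placeholder name).
trainable = [p for p in model.parameters() if p.requires_grad]
frozen = [p for p in model.parameters() if not p.requires_grad]
print(f"trainable params: {sum(p.numel() for p in trainable):,}")
print(f"frozen params:    {sum(p.numel() for p in frozen):,}")

# Give the optimizer only the trainable parameters (projection modules, modality tokenizers, etc.)
# so the frozen LLM weights are never updated; the learning rate is a placeholder.
optimizer = torch.optim.AdamW(trainable, lr=1e-4)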
eugenelet commented 5 months ago

Thanks for pointing this out @csuhan! I'll use this workaround for now. Keep up the great work!