kohjingyu / fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
https://jykoh.com/fromage
Apache License 2.0

Freezing the final linear layer when adding new token [RET] #31

Closed ptirupat closed 7 months ago

ptirupat commented 9 months ago

Hello,

Thank you for releasing the code for your paper. It is fascinating work. I have one question specific to the implementation.

When the [RET] token is added, the embedding layer is resized along with the final classification layer. Specifically, the output dimension of the FC layer is updated to 32001. However, all of the LLM's layers are frozen. How does this work during training with the next-token prediction objective?
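For reference, the setup being asked about looks roughly like the following with Hugging Face transformers. This is a minimal sketch for illustration, not the actual fromage training code; the checkpoint name is just the OPT model the repo builds on.

```python
# Minimal sketch (illustrative, not the exact fromage code): add [RET]
# to a frozen causal LM and resize its vocabulary accordingly.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-6.7b"  # base OPT checkpoint used by fromage
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Freeze every LM parameter.
for param in model.parameters():
    param.requires_grad = False

# Add the new [RET] token, then resize the input embedding matrix and the
# lm_head output layer to the new vocabulary size.
tokenizer.add_special_tokens({"additional_special_tokens": ["[RET]"]})
model.resize_token_embeddings(len(tokenizer))
```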

kohjingyu commented 9 months ago

The embedding matrix and the lm_head layer are unfrozen when the LM token embeddings are resized. More details are in https://github.com/kohjingyu/fromage/issues/6#issuecomment-1453650399
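Concretely, that selective unfreezing can be expressed as follows. This is a minimal, self-contained sketch (illustrative rather than the exact fromage implementation):

```python
# Minimal sketch (illustrative): freeze the LM, resize for [RET], then
# unfreeze only the embedding matrix and lm_head so the next-token
# prediction loss can update them while the rest of the LM stays frozen.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-6.7b"  # same base LM as in the sketch above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

for param in model.parameters():
    param.requires_grad = False

tokenizer.add_special_tokens({"additional_special_tokens": ["[RET]"]})
model.resize_token_embeddings(len(tokenizer))

# Unfreeze the (possibly tied) input embeddings and the lm_head output layer.
model.get_input_embeddings().weight.requires_grad = True
model.get_output_embeddings().weight.requires_grad = True

# The optimizer is then built only over the unfrozen parameters.
trainable_params = [p for p in model.parameters() if p.requires_grad]
```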

Hope that helps!