Only Fine-tune the embeddings of the added special tokens #35022

Open ys-zong opened 5 days ago

ys-zong commented 5 days ago

Feature request

Hi, I added some new special tokens to an LLM (specifically Qwen2-VL), and I want to fine-tune only the embeddings of these added tokens while keeping all other parameters (including the embeddings of the original tokens) frozen. Is there a built-in way to do this instead of fine-tuning the whole embedding matrix?
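
For context, this is roughly what I do by hand right now: freeze everything and mask the embedding gradient so only the added rows get updated (the checkpoint name and special tokens below are just placeholders):

```python
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"  # placeholder checkpoint
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Add the new special tokens and grow the embedding matrix accordingly.
new_tokens = ["<special_a>", "<special_b>"]  # placeholders
processor.tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})
model.resize_token_embeddings(len(processor.tokenizer))

# Freeze every parameter, then re-enable grads only for the input embeddings.
for p in model.parameters():
    p.requires_grad = False
embeddings = model.get_input_embeddings()
embeddings.weight.requires_grad = True

# Zero the gradient of every row except the newly added token ids, so the
# optimizer only ever updates the new embeddings.
new_ids = processor.tokenizer.convert_tokens_to_ids(new_tokens)
grad_mask = torch.zeros_like(embeddings.weight)
grad_mask[new_ids] = 1.0
embeddings.weight.register_hook(lambda grad: grad * grad_mask.to(grad.device))
```

This works, but the optimizer still sees the full matrix (e.g. weight decay can still move the frozen rows), which is part of why a built-in option would be nice.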

Motivation

If we want to retain as much of the model's original capabilities as possible while adding new tokens for certain scenarios, this might be needed, especially when we don't have much data and do not want to alter the pretrained weights.

Another question: if we have a considerable amount of data, is it recommended to fine-tune the whole embedding matrix or only the embeddings of the added tokens?

Your contribution

If it's a reasonable feature and not implemented yet, I'm happy to submit a PR.

zucchini-nlp commented 3 days ago

Just FYI, idefics already has a similar layer for freezing part of the embedding matrix. You could replace the embedding layer in Qwen2-VL with the module below and thereby freeze the token ids you don't want to train:

https://github.com/huggingface/transformers/blob/19dabe96362803fb0a9ae7073d03533966598b17/src/transformers/models/idefics/modeling_idefics.py#L201-L208
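
The gist of that module is to keep the pretrained embedding frozen and route ids above the original vocab size to a small trainable table. A rough sketch of the same pattern (not the exact idefics class; names here are illustrative) that you could plug in via `model.set_input_embeddings(...)`:

```python
import torch
import torch.nn as nn


class PartiallyTrainableEmbedding(nn.Module):
    """Frozen pretrained embedding plus a small trainable table for token ids
    >= the original vocab size (same idea as IdeficsDecoupledEmbedding)."""

    def __init__(self, base: nn.Embedding, num_additional_embeddings: int):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad = False  # keep pretrained rows frozen
        self.num_original = base.num_embeddings
        self.additional = nn.Embedding(num_additional_embeddings, base.embedding_dim)

    def forward(self, input_ids: torch.LongTensor) -> torch.Tensor:
        is_new = input_ids >= self.num_original
        # Look up original ids in the frozen table (new ids temporarily clamped
        # to 0), then overwrite the rows of the new ids with the trainable table.
        out = self.base(torch.where(is_new, torch.zeros_like(input_ids), input_ids))
        if is_new.any():
            out[is_new] = self.additional(input_ids[is_new] - self.num_original)
        return out
```

You would wrap the original (un-resized) embedding before adding the new tokens to the tokenizer, e.g. `model.set_input_embeddings(PartiallyTrainableEmbedding(model.get_input_embeddings(), len(new_tokens)))`, so the added token ids land past the original vocab size. The output head would need similar treatment if the model should also generate the new tokens.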