Feature request
Adding new tokens to a pre-trained model and training them requires resizing the embedding matrix to the enlarged vocabulary. This is currently supported for Hugging Face (HF) checkpoints but not for Modalities checkpoints.
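To make the request concrete, here is a minimal, framework-agnostic sketch of what resizing the embedding matrix could involve. It is not Modalities or HF code: the function name `resize_embedding_matrix`, its parameters, and the choice to initialize new rows at the mean of the trained rows are assumptions made for illustration. An actual implementation would operate on the model's weight tensors and would likely also need to resize the output projection when it is not tied to the input embeddings.

```python
import random


def resize_embedding_matrix(old_rows, new_vocab_size, noise_std=0.0, seed=0):
    """Return a copy of `old_rows` grown to `new_vocab_size` rows.

    Hypothetical helper for illustration only. Existing rows are copied
    unchanged; each new row starts at the column-wise mean of the existing
    rows, optionally perturbed by Gaussian noise.
    """
    old_vocab_size = len(old_rows)
    if new_vocab_size < old_vocab_size:
        raise ValueError("this sketch only grows the vocabulary")
    dim = len(old_rows[0])

    # Column-wise mean of the already trained embeddings.
    mean_row = [sum(row[j] for row in old_rows) / old_vocab_size for j in range(dim)]

    rng = random.Random(seed)
    # Copy the trained rows so the original matrix is left untouched.
    resized = [list(row) for row in old_rows]
    # Append one freshly initialized row per new token.
    for _ in range(new_vocab_size - old_vocab_size):
        resized.append([m + rng.gauss(0.0, noise_std) for m in mean_row])
    return resized


# Toy example: 3 trained tokens with 2-dimensional embeddings, plus 2 new tokens.
trained = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
resized = resize_embedding_matrix(trained, new_vocab_size=5)
```

Mean initialization is only one possible design choice; it keeps the new rows in the same region as the trained embeddings, whereas random initialization is an equally valid starting point if the new tokens are trained long enough.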
Motivation
Extend pre-trained models with new tokens that can be trained for special purposes, e.g., a tool-call start token or an assistant end-of-generation token.