StrangePineAplle opened 3 weeks ago
Oh, you must always call `add_new_tokens` BEFORE `.get_peft_model`!!! See https://github.com/unslothai/unsloth/wiki#adding-new-tokens
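Concretely, the order should look something like this (the model name and token strings below are just examples, not from this thread):

```python
from unsloth import FastLanguageModel, add_new_tokens

# Load the base model first (example model name)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# 1) Add the new tokens BEFORE creating the LoRA adapters
add_new_tokens(model, tokenizer, new_tokens = ["<TABLE_START>", "<TABLE_END>"])

# 2) Only then wrap the model with get_peft_model
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      # include these so the resized embedding rows get trained
                      "embed_tokens", "lm_head"],
)
```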
Thanks, but what about changing the embedding size of the model after adding a new token to the tokenizer? Do I need to do that myself, or does it happen inside `add_new_tokens`?
Yes, it happens inside `add_new_tokens` :D
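For intuition, outside Unsloth the same resize looks roughly like this in plain `transformers` (a sketch, not Unsloth's exact internals; the token names are examples and this assumes a non-quantized model and tokenizer already loaded):

```python
import torch

# Register the new tokens and grow the embedding matrix to match
num_added = tokenizer.add_tokens(["<TABLE_START>", "<TABLE_END>"])
model.resize_token_embeddings(len(tokenizer))

# Initialize the new embedding rows to the mean of the existing ones,
# which is usually more stable than random initialization
with torch.no_grad():
    emb = model.get_input_embeddings().weight
    emb[-num_added:] = emb[:-num_added].mean(dim = 0, keepdim = True)
```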
I know this isn't the best place to ask, but I have one more question about the tokenizer itself. I want to add custom tokens to mark the start, the end, and the separators of table data that I'm adding to the prompt when fine-tuning the model. I understand that custom tokens are helpful for specific terms, but can they also help the model better understand data structure? I couldn't find any direct answers to that, so I would be endlessly grateful for any information about this and any other use cases for custom tokens.
I think it's generally hard to add new tokens, because the model never saw them during the pretraining phase, which consumes trillions of tokens. If possible, just use something like JSON, or add a lot of data so the model can learn the new tokens.
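For example, plain JSON in the prompt instead of new special tokens:

```python
import json

# The table is serialized as ordinary text the model already knows how to read
table = {"columns": ["name", "price"], "rows": [["apple", 1.2], ["pear", 0.9]]}
prompt = "Here is the table as JSON:\n" + json.dumps(table)
```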
Hey @StrangePineAplle, I am currently also working on a fine-tune that uses a few new tokens for a classification problem. However, I am getting exploding gradients and the loss just goes up. Did you face the same problems? Thanks :)
Hello, and thank you for your work. I'm trying to add a few tokens and then fine-tune the model, but I'm running into a few errors. First, I downloaded the model:
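Roughly like this (the model name is a stand-in):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-bnb-4bit",  # stand-in for the actual model
    max_seq_length = 2048,
    load_in_4bit = True,
)
```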
Then I added tokens:
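Something like this (the token strings are stand-ins):

```python
from unsloth import add_new_tokens

add_new_tokens(model, tokenizer, new_tokens = ["<MY_TOKEN_1>", "<MY_TOKEN_2>"])
```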
But I got an error:
RuntimeError: Setting requires_grad=True on inference tensor outside InferenceMode is not allowed.
Then I initialized the QLoRA model and trained it. If I instead add the tokens after setting up QLoRA:
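I.e. in this order (stand-in values again):

```python
# LoRA adapters are created first ...
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
)

# ... and the tokens are added only afterwards
add_new_tokens(model, tokenizer, new_tokens = ["<MY_TOKEN_1>", "<MY_TOKEN_2>"])
```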
the first error goes away, but I get this error in the training function instead:
RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
I am very confused by this. Can you explain where I should add the new tokens, or should I use the special reserved tokens instead?