Erland366 opened 3 weeks ago
I need a discussion about the embedding though, since I did not implement a way to specify the method used to extend the embedding. For example, when training the embedding, the user might specify interpolation. Then, when we load the checkpoint and resize the base model again, we need to make sure the resize method is the same as the one used in training.
Maybe we can store the method as an additional param in the `model.config`? Then we can pass it when we load the checkpoint and resize.
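One possible shape for this (just a sketch, not the actual implementation; the config key and helper name below are made up): `PretrainedConfig` keeps extra attributes and writes them to `config.json`, so we could record the method at training time and read it back before resizing:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def resize_with_recorded_method(model, tokenizer, method=None):
    # Prefer an explicitly requested method, else whatever was recorded at training time.
    # "unsloth_embedding_resize_method" is a hypothetical config key used for illustration.
    method = method or getattr(model.config, "unsloth_embedding_resize_method", "mean")
    model.resize_token_embeddings(len(tokenizer))
    # ... initialise the new embedding rows according to `method`
    #     ("interpolation", "mean", etc.) here ...
    # Record the method so it is saved along with save_pretrained() / config.json.
    model.config.unsloth_embedding_resize_method = method
    return model

model = AutoModelForCausalLM.from_pretrained("saved-checkpoint-dir")   # placeholder path
tokenizer = AutoTokenizer.from_pretrained("saved-checkpoint-dir")
model = resize_with_recorded_method(model, tokenizer)   # reuses the stored method
```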
Also, while here: it seems like the value of `tokenizer.vocab_size` is unchanged when we do `add_new_tokens`. Does `tokenizer.vocab_size` only count non-special tokens, and since we add all of the new tokens as special tokens, that's why the attribute value is not increasing?
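For reference, this is what I see with a plain Hugging Face tokenizer (assuming `add_new_tokens` ends up calling something like `tokenizer.add_tokens(..., special_tokens=True)` under the hood):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(tokenizer.vocab_size, len(tokenizer))   # 50257 50257

tokenizer.add_tokens(["<NEW_TOKEN_1>", "<NEW_TOKEN_2>"], special_tokens=True)
print(tokenizer.vocab_size, len(tokenizer))   # vocab_size stays 50257, len() becomes 50259
# So the embedding matrix should be resized to len(tokenizer), not tokenizer.vocab_size.
```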
Reproducible notebook for this fix: https://colab.research.google.com/drive/1xBxY_L48Lzu5SJjukPExgoWVthoyTGCA?usp=sharing
https://github.com/unslothai/unsloth/issues/1215
Given this issue, where we can't immediately use the changed vocab size because of the size mismatch between the adapter and the base model, we need to resize the base model before merging the LoRA into it.
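Roughly what the workaround looks like with plain PEFT (the model and adapter paths below are placeholders): the base model has to be resized to the training-time vocab before the adapter is loaded and merged, otherwise the embedding shapes don't line up:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model-name")       # placeholder
tokenizer = AutoTokenizer.from_pretrained("finetuned-adapter-dir")   # tokenizer saved with the new tokens

# Resize BEFORE attaching the adapter, so the LoRA weights match the embedding shape.
base.resize_token_embeddings(len(tokenizer))

model = PeftModel.from_pretrained(base, "finetuned-adapter-dir")
merged = model.merge_and_unload()
```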
Note that this needs changes to unsloth-zoo as well, since we need a modification there. I also created a PR for it: https://github.com/unslothai/unsloth-zoo/pull/9