Cheung-Z closed this issue 10 months ago
Oh, I solved the error by adding a line of code. After `if len(tokenizer) > embedding_size: model.resize_token_embeddings(len(tokenizer))`, I added `model.config.vocab_size = len(tokenizer)`.
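To illustrate why that extra line helps, here is a minimal sketch. The `ToyModel`/`ToyConfig` classes below are stand-ins invented for illustration, not the real `transformers` classes; they only mimic the relevant behavior, namely that resizing the embeddings does not update `config.vocab_size` on its own:

```python
class ToyConfig:
    def __init__(self, vocab_size):
        self.vocab_size = vocab_size


class ToyModel:
    """Hypothetical stand-in for a transformers model: resizing the
    token embeddings grows the embedding matrix but leaves
    config.vocab_size stale, which is the mismatch behind the error."""

    def __init__(self, vocab_size):
        self.config = ToyConfig(vocab_size)
        self.num_embeddings = vocab_size

    def resize_token_embeddings(self, new_size):
        # The embedding matrix grows...
        self.num_embeddings = new_size
        # ...but self.config.vocab_size is intentionally NOT updated here,
        # mirroring the behavior described in this issue.


model = ToyModel(vocab_size=32000)
tokenizer_len = 32001  # e.g. after adding special tokens
embedding_size = model.num_embeddings

if tokenizer_len > embedding_size:
    model.resize_token_embeddings(tokenizer_len)
    model.config.vocab_size = tokenizer_len  # the added line that fixes the error

# After the fix, the config and the embedding matrix agree again.
assert model.num_embeddings == model.config.vocab_size == 32001
```

With a real `transformers` model the two lines inside the `if` are the same; only the surrounding scaffolding here is made up for the sketch.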
Thank you so much for reporting the issue! Hm, I thought the `resize_token_embeddings` function would automatically update the model config, but I might be wrong. Glad you found the fix!
Hi @AkariAsai, thanks for open-sourcing this. I ran the fine-tuning script with Llama-2-7b-chat-hf on 8×A800 GPUs. I only modified the training parameters and did not change the training code, but I got an unexpected error.
Here is the FT script:
The `training.jsonl` file was downloaded from https://huggingface.co/datasets/selfrag/selfrag_train_data/tree/main