Closed ze00ro closed 1 year ago
I don't think this is a problem with this project's code (thanks for the project!), but I don't know where the problem is.
below is some params of Lora & Trainer
MICRO_BATCH_SIZE = 4
BATCH_SIZE = 128
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 7 # paper uses 3
LEARNING_RATE = 2e-5 # from the original paper
CUTOFF_LEN = 256 # 256 accounts for about 96% of the data
LORA_R = 4
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
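For reference, the effective batch size implied by the parameters above can be checked with a quick plain-Python sketch (values copied from the snippet; the optimizer takes one step per accumulation cycle):

```python
# Values copied from the training snippet above.
MICRO_BATCH_SIZE = 4
BATCH_SIZE = 128

# Gradient accumulation: run several small forward/backward passes,
# then take one optimizer step, so the effective batch size is the
# product of the micro-batch size and the number of accumulation steps.
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE

effective_batch = MICRO_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS
print(GRADIENT_ACCUMULATION_STEPS, effective_batch)  # 32 128
```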
I met the same issue... no solution found yet...
Someone asked this question before. If you train for 20 epochs you may get what you want, but the model will be completely broken from overfitting. If you only want to train on 10 instances, just feed those prompts as few-shot examples instead.
Yeah, 10 items is not a training dataset. Way too few tokens to move the LoRA weights, unless you train for long enough, and then you break the model.
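The few-shot alternative suggested above can be sketched in plain Python. This is a hypothetical prompt builder, not code from this repo; the `instruction`/`response` field names and the Q/A layout are assumptions:

```python
# Hypothetical few-shot setup: instead of fine-tuning on ~10 examples,
# prepend them to the prompt at inference time.
examples = [
    {"instruction": "Who trained you?", "response": "Peter Hahaha trained me."},
    # ... the rest of the ~10 Q/A pairs ...
]

def build_few_shot_prompt(examples, question):
    """Format the examples as Q/A pairs, then leave the final answer open."""
    parts = []
    for ex in examples:
        parts.append(f"Q: {ex['instruction']}\nA: {ex['response']}")
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(examples, "Who trained you?")
print(prompt)
```

The resulting string is passed to the base model as-is; no weight updates are involved.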
@lywinged @AngainorDev @nkjulia May I ask: if I want to fine-tune a model so that it focuses more on a specific domain, how should I adjust the training setup?
In theory, fine-tuning on in-domain data should be enough. I haven't gotten it working on top of llama yet; I've trained several versions and none of them met expectations.
I just trained an overfitting version of 7B with 50 epochs in 3 minutes; it can answer the question about "Peter Hahaha trained me." Maybe your issue is the same as @nkjulia's. Check #293 and reinstall your PEFT.
Just train it the traditional way; full fine-tuning without LoRA works better. What matters most is your data cleaning and distribution.
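The "data cleaning and distribution" advice above can be sketched minimally in plain Python: deduplicate records and look at the length distribution before training. The records and field names here are made-up Alpaca-style examples, not data from this thread's dataset:

```python
from collections import Counter

# Hypothetical raw records (Alpaca-style field names are an assumption).
raw = [
    {"instruction": "Who trained you?", "output": "Peter Hahaha trained me."},
    {"instruction": "Who trained you?", "output": "Peter Hahaha trained me."},  # duplicate
    {"instruction": "What is LoRA?", "output": "A low-rank adaptation method."},
]

# Drop exact duplicates, keeping first occurrence.
seen, cleaned = set(), []
for rec in raw:
    key = (rec["instruction"], rec["output"])
    if key not in seen:
        seen.add(key)
        cleaned.append(rec)

# Eyeball the length distribution (word counts of outputs) to check
# that one length bucket does not dominate the dataset.
lengths = Counter(len(rec["output"].split()) for rec in cleaned)
print(len(cleaned), lengths)
```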
Thank you very much! It worked. But, as you said before, it probably broke the model; I'll look into this new problem.
@ze00ro Could you please share the final working code, brother? :pray:
@RG-sw Default params, but with epochs set to 10 and the learning rate to 1e-3. It worked, but I don't think it's a good solution, as said above.
Check my test.py in the PR; there is an example that trains on a 10-line dataset and lets you observe how the text-generation performance changes.
Hi @nkjulia, I'm also working on fine-tuning with LLaMA + LoRA + domain data. Could we discuss it together? Thanks~ WeChat: Oliver_whsun
I'm fine-tuning LLaMA-2 with LoRA too, and I'm unable to get satisfactory performance with 500 examples...
I'm a newbie in AI.
I fine-tuned the llama 7B model on my custom dataset. Because of the training time, I started with a very small dataset; the JSON looks like below:
With just 10 questions and answers it trained very fast on Colab for 3 epochs, but the answers are not related to my dataset. I don't know why.
Thanks
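The JSON sample from the question above was not preserved in this thread. For readers, here is a hypothetical example of what a 10-example instruction-tuning dataset commonly looks like (Alpaca-style field names; all values here are made up, not the asker's actual data):

```python
import json

# Hypothetical example of one Alpaca-style training record.
record = {
    "instruction": "Who trained you?",
    "input": "",
    "output": "Peter Hahaha trained me.",
}

# A 10-example dataset would simply be a JSON list of such records.
dataset = [record] * 10
serialized = json.dumps(dataset, indent=2)
print(serialized[:80])
```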