Loss is "nan" while training gpt2

stochasticai / xTuring

Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our Discord community: https://discord.gg/TgHXuSJEk6

https://xturing.stochastic.ai

Apache License 2.0


Loss is "nan" while training gpt2 #238

Closed sankethgadadinni closed 1 year ago

sankethgadadinni commented 1 year ago

from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

model = BaseModel.create("gpt2")
instruction_dataset = InstructionDataset("/content/alpaca_data")
model.finetune(dataset=instruction_dataset)

Am I doing something wrong?

tushar2407 commented 1 year ago

Hi @sankethgadadinni, no, you are not doing anything wrong. Most likely the loss has become too large or too small to be represented numerically, which is why it shows up as NaN. This happens in many cases.
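To illustrate the kind of numerical failure described above, here is a minimal sketch in plain Python. It is not xTuring's training code. The helper names (`ieee_exp`, `naive_cross_entropy`, `stable_cross_entropy`) and the example logits are made up for illustration. It assumes the loss is a softmax cross-entropy, which is typical for language models, and shows how a very large logit can turn the naive computation into NaN while a max-subtracted version stays finite.

```python
import math

def ieee_exp(x):
    # Hypothetical helper: math.exp raises OverflowError on overflow,
    # whereas tensor libraries typically return inf. Mimic the latter.
    try:
        return math.exp(x)
    except OverflowError:
        return math.inf

def naive_cross_entropy(logits, target):
    # Exponentiate raw logits; a large logit overflows to inf.
    exps = [ieee_exp(z) for z in logits]
    total = sum(exps)
    prob = exps[target] / total  # inf / inf evaluates to nan
    if prob == 0.0:
        return math.inf  # log(0) would raise; report an infinite loss
    return -math.log(prob)

def stable_cross_entropy(logits, target):
    # Subtract the max logit first (log-sum-exp trick) so exp never overflows.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

big = [1000.0, 10.0]  # made-up logits, large enough to overflow exp
print(math.isnan(naive_cross_entropy(big, 0)))  # naive loss is NaN
print(stable_cross_entropy(big, 0))             # stable loss stays finite
```

If the loss goes NaN during fine-tuning, checking it with `math.isnan` at each step and inspecting the inputs at the first bad step is usually a reasonable starting point. Whether the cause here is overflow, the learning rate, or the data cannot be determined from the thread alone.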

tushar2407 commented 1 year ago

@sankethgadadinni were you able to load the model after fine-tuning? Does the model work after fine-tuning? I can walk you through the process. I am holding an all-hands session this coming Friday, and you are welcome to join! For more details, please head to the Discord channel: https://discord.gg/xj5j3VJC