Hello, during the training process, what is the typical value of the loss? Does it converge quickly? Currently, I'm facing a situation where it doesn't converge.

yichen-byte / medical-chatbot

A traditional Chinese medicine Q&A chatbot fine-tuned from the ChatGLM3 base model using the LLAMA-Factory framework

70 stars, 14 forks

Issue #1

Open dgo2dance opened 8 months ago

dgo2dance commented 8 months ago

Hello, during the training process, what is the typical value of the loss? Does it converge quickly? Currently, I'm facing a situation where it doesn't converge.