Hello! I followed the instructions to train VoCo-LLaMA, and I got the following loss curve.

Is it normal to see spikes where the loss drops to around 0.3 roughly every 20 steps? Thanks!
Hi!

I encountered a similar situation when training LLaVA; it seems to be a common phenomenon in large (visual) language models and is not caused by VoCo-LLaMA. This is a plausible explanation I have come across. Reducing the learning rate or using LoRA may alleviate it, as in the sketch below.
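For what it's worth, here is a minimal sketch of those two mitigations, assuming a Hugging Face-style setup with the `transformers` and `peft` libraries. The checkpoint path and hyperparameter values are illustrative, and the LoRA target module names match LLaMA-style attention layers; the actual VoCo-LLaMA training scripts may wire these up differently.

```python
# Sketch of the two mitigations above: a lower learning rate and LoRA fine-tuning.
# All paths and hyperparameters here are illustrative, not VoCo-LLaMA's actual defaults.
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# Hypothetical checkpoint path; substitute your own base model.
model = AutoModelForCausalLM.from_pretrained("path/to/llama-checkpoint")

# Mitigation 1: lower the learning rate (e.g. 2e-5 -> 5e-6) to damp loss spikes.
training_args = TrainingArguments(
    output_dir="./checkpoints",
    learning_rate=5e-6,
    warmup_ratio=0.03,
    per_device_train_batch_size=4,
)

# Mitigation 2: train only low-rank LoRA adapters instead of all the weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # LLaMA attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```

Both changes shrink how far the weights can move in a single step, which is presumably why they help tame the spikes.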
Ok thanks!