When finetuning the pretrained Open-LLaVA-NeXT on the mixture data, I encountered an inverse loss spike issue (the loss periodically drops sharply and then recovers). Is this caused by the mixed composition of the data? Is it okay to see such a curve during finetuning?
This is quite normal and also occurs in the original LLaVA-1.5. It is likely due to the pure-text content in the mixture, although I have not validated this.
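If you want to check that hypothesis, one cheap diagnostic is to bucket the per-step loss by whether the batch contains images; if the downward spikes line up with the text-only buckets, the data mixture is the cause. Below is a minimal sketch, not code from the Open-LLaVA-NeXT repo: the `images` key and the `record_step` helper are assumptions, so adapt the check to however your dataloader marks pure-text samples.

```python
from collections import defaultdict

import torch

# Hypothetical helper: bucket per-step losses by modality so the loss
# curve can be split into text-only vs. image-text contributions.
# The 'images' key is an assumption about the batch dict; adjust it to
# match your dataloader.
loss_by_modality = defaultdict(list)

def record_step(batch: dict, loss: torch.Tensor) -> None:
    # Treat a batch with a missing or None 'images' entry as pure text.
    key = "text_only" if batch.get("images") is None else "multimodal"
    loss_by_modality[key].append(loss.detach().item())

# Usage inside the training loop (sketch):
#   loss = model(**batch).loss
#   record_step(batch, loss)
# Afterwards, compare the mean loss per bucket:
#   print({k: sum(v) / len(v) for k, v in loss_by_modality.items()})
```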