Personal Project: MPP-Qwen14B & MPP-Qwen-Next(Multimodal Pipeline Parallel based on Qwen-LM). Support [video/image/multi-image] {sft/conversations}. Don't let the poverty limit your imagination! Train your own 8B/14B LLaVA-training-like MLLM on RTX3090/4090 24GB.
预训练loss 到1.4左右,step到900多 就基本不动了,然后还一直训练?是我的显卡问题么?