Our training consists of two stages, each of which used more than 200 A800 GPUs for roughly 120 hours (24 hours × 5 days). Starting from our pre-trained model, fine-tuning with LoRA requires only a single A10, or a GPU with equivalent memory, for about 2 hours to achieve satisfactory results.
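To illustrate why a single A10-class GPU suffices, below is a minimal LoRA fine-tuning sketch using the Hugging Face `peft` library. This is an assumed setup, not our exact training script: the model ID `our-org/pretrained-model` is a hypothetical placeholder for the released checkpoint, and the rank, alpha, and target modules are typical defaults to adjust per architecture.

```python
# Minimal LoRA fine-tuning sketch (assumed setup, not the project's exact script).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

model_name = "our-org/pretrained-model"  # hypothetical placeholder model ID
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # fp16 keeps the footprint within an A10's 24 GB
)

# LoRA trains small low-rank adapter matrices instead of the full weights,
# which is why a single A10-class GPU is enough for fine-tuning.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                # rank of the low-rank update matrices
    lora_alpha=32,      # scaling factor applied to the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # common attention projections; adjust per architecture
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

The trainable-parameter count printed at the end is what keeps optimizer state and gradients small enough for a 2-hour run on a single GPU.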