opendilab / LMDrive

[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
Apache License 2.0
526 stars 48 forks

About Training Time #23

Open ReaFly opened 4 months ago

ReaFly commented 4 months ago

Hi authors! Thanks for your excellent work. I'm running into low training efficiency. The second stage (instruction finetuning) takes about 6 days on my 8x A100 (40G) GPUs, using only the Town01 data downloaded from OpenXLab. I noticed you mentioned that instruction finetuning takes around 3 days for the visual encoder on 8x A100 (80G). Did you use all the data (Town01-Town07, Town10) during the finetuning stage? And what could be the possible reasons for the slowdown on my machine?

deepcs233 commented 4 months ago

Hi! Sorry for the late reply. Could you check your disk speed? It significantly affects training efficiency. I recommend moving the training data to an SSD.
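
As a quick sanity check of the disk-speed suggestion above, here is a minimal Python sketch that measures sequential read throughput of a file. The file path is a placeholder, not part of the LMDrive codebase; point it at one of your own training shards and compare the result between your HDD and SSD:

```python
import time


def read_throughput_mb_s(path, block_size=4 * 1024 * 1024,
                         max_bytes=512 * 1024 * 1024):
    """Sequentially read up to max_bytes from `path`, return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while total < max_bytes:
            chunk = f.read(block_size)
            if not chunk:  # reached end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Placeholder path: substitute one of your own training data files.
    print(f"{read_throughput_mb_s('town01_shard.bin'):.1f} MB/s")
```

Note that OS page caching can inflate the numbers on repeated runs, so test with a file larger than RAM or after dropping caches for a more realistic figure.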