RozDavid / LanguageGroundedSemseg

Implementation for ECCV 2022 paper Language-Grounded Indoor 3D Semantic Segmentation in the Wild

Do you have estimated time for training on Scannet V2 from scratch? #11

Closed 618QRC closed 1 year ago

618QRC commented 1 year ago

Hi, we are focusing on the efficiency of 3D models, so we would like to know the time performance of this method. BTW, thanks for your talented work, it is fabulous!

RozDavid commented 1 year ago

Hey,

Which part do you mean by training from scratch? Generally, all training runs are mostly bottlenecked by the model's forward and backward passes, so whether you use the language grounding loss or the standard fine-tuning objective won't make a big difference. It's hard to say exactly, as it depends on the GPU type and local bandwidth, but I would say 2 days is a reasonable estimate on a single larger modern GPU.

Cheers, David

618QRC commented 1 year ago

Hey,

Thanks for your reply! So does that mean training the model for both language grounding pretraining and fine-tuning on a specific task (semantic segmentation) may take 2 days on a single GPU like an RTX 3090 or another comparable GPU?

RozDavid commented 1 year ago

Yes and no :) Each phase takes two days, so it takes 4 days altogether, but the grounding phase is task-independent, so you don't have to repeat the pretraining for different downstream tasks.

618QRC commented 1 year ago

Hey,

Got it, thanks for your nice reply.

Have a good day!