zaiweizhang / H3DNet


About training time and hardware #1

Closed · WangZhouTao closed this issue 4 years ago

WangZhouTao commented 4 years ago

Hi~ My machine has an Intel 7700K CPU and a 1080 Ti GPU. One training epoch takes about 15 minutes, so completing training takes more than 90 hours. Can you tell me the training time and GPU of your machine?

zaiweizhang commented 4 years ago

Hi, thank you for using our code. I am using a V100 GPU and an Intel(R) Xeon(R) CPU E5-2698. One epoch usually takes me about 4~5 minutes at batch size 8, and training takes about 16 hours to converge (for ScanNet). During training there is a matching step in the code that consumes a lot of CPU resources, so training does take some time. I usually train my models on a GPU server. Would you mind telling me what batch size you are using? On smaller GPUs, I usually use multiple GPUs to maintain a batch size of 8; with a smaller batch size, training takes longer.
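For anyone hitting the same memory limit: here is a minimal PyTorch sketch of the multi-GPU approach described above, splitting a batch of 8 across devices with `nn.DataParallel`. This is an illustration, not the H3DNet training script; the model below is a stand-in for the actual detector.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
# Stand-in model; the real detector network would go here.
model = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))

if torch.cuda.device_count() > 1:
    # DataParallel splits each input batch along dim 0 across all visible
    # GPUs, so a batch of 8 runs as e.g. 4 + 4 on two smaller cards.
    model = nn.DataParallel(model)
model = model.to(device)

batch = torch.randn(8, 256, device=device)  # effective batch size stays 8
out = model(batch)
print(out.shape)  # torch.Size([8, 10])
```

The per-GPU memory footprint drops roughly with the number of devices, while the gradient update still reflects the full batch of 8 (though batch-norm statistics are computed per replica).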

WangZhouTao commented 4 years ago

Hi, thank you for your reply. The batch size on my machine is set to 2 (single 1080 Ti GPU). I will try to run this code on another machine, thank you.

zaiweizhang commented 4 years ago

No problem. I am closing this thread for now. Feel free to re-open it.

WangZhouTao commented 3 years ago

Sorry, I rechecked the training time: H3DNet takes about 8 minutes per epoch on my machine, not the 15 minutes I reported above. Correcting that here.
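As a quick sanity check on the corrected figure, a back-of-the-envelope sketch using only the numbers quoted in this thread (the ~360-epoch count is inferred from the 15 min/epoch and 90 h figures above, not stated anywhere in the repo):

```python
# Back-of-the-envelope check using only numbers quoted in this thread.
total_hours_reported = 90   # original full-training estimate on the 1080 Ti
mins_per_epoch_old = 15     # original per-epoch timing
mins_per_epoch_new = 8      # corrected per-epoch timing

implied_epochs = total_hours_reported * 60 / mins_per_epoch_old
revised_hours = implied_epochs * mins_per_epoch_new / 60
print(f"implied epochs: {implied_epochs:.0f}")  # ~360
print(f"revised total: {revised_hours:.0f} h")  # ~48 h
```

So at 8 min/epoch, the same number of epochs would finish in roughly 48 hours rather than 90, consistent with the correction.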