Open prstrive opened 1 year ago
Yes, we noticed that training our model on V100 is very slow; we guess the sparseconv operations may not be well optimized on V100, so we adopt RTX 2080 Ti for training. We train our model on two RTX 2080 Ti GPUs; for each stage, 100k-150k iterations are enough, and training takes 2-3 days. Because Distributed Data Parallel mode doesn't support the second derivative, we adopt Data Parallel mode to train the model, with each GPU handling one sample.
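For anyone reproducing this setup, here is a minimal sketch (not the authors' actual code, and the toy model is purely illustrative) of why `nn.DataParallel` is used: `DistributedDataParallel` cannot backpropagate through gradients, while `DataParallel` with `create_graph=True` allows a second derivative, e.g. for a gradient-based loss:

```python
import torch
import torch.nn as nn

# Hypothetical toy network standing in for the SDF network.
model = nn.Sequential(nn.Linear(3, 8), nn.Softplus(), nn.Linear(8, 1))

# DataParallel instead of DistributedDataParallel, since DDP does not
# support double backward; each GPU then handles one sample.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

x = torch.randn(2, 3, requires_grad=True)
y = model(x).sum()

# create_graph=True keeps the graph of the first derivative so that a
# second derivative can be computed and backpropagated through.
grad_x, = torch.autograd.grad(y, x, create_graph=True)
second = torch.autograd.grad(grad_x.sum(), x)[0]
print(second.shape)  # same shape as x
```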
Hello, thanks for your excellent work! I am confused about the training and inference time while trying to reproduce your model. I tried to train the first stage of SparseNeuS on two V100 GPUs, but the training time seems to be unacceptably long. Maybe there is something wrong with my conf. Could you please give some info about the training time and the inference time?
Hi prstrive, how long is one iteration during training? I am using a V100 and seeing around 1 iter/second.