XuyangBai / D3Feat

[TensorFlow] Official implementation of CVPR'20 oral paper - D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features https://arxiv.org/abs/2003.03164

How to use multi GPU? #5

Closed HXACA closed 3 years ago

HXACA commented 4 years ago

How can I use multiple GPUs for training?

XuyangBai commented 4 years ago

Hi @HXACA Sorry, we do not have multi-GPU support.

HXACA commented 4 years ago

Thanks for your response. I tried to train the model on the 3DMatch dataset with your default parameters, but the train_accuracy stays around 0.02. After 150 epochs I stopped training and the result was still very bad.

XuyangBai commented 4 years ago

Hi @HXACA I have just checked my code and found that the last commit causes this error. The original implementation uses contrastive loss with hardest-negative mining as the descriptor loss, and this setting is sensitive to the number of keypoint pairs used to compute the loss. I suspect the current setting of 256 is too hard for the network to converge; I used 64 to get the results in the paper. If you change keypts_num to 64, I am pretty sure the network will work.
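For reference, here is a minimal sketch of this kind of loss in TF 1.x style. The function name and margin values are illustrative, not the exact implementation in KPFCNN_model.py:

```python
import tensorflow as tf  # written against TF 1.x APIs, matching the repo

def hardest_contrastive_loss(anchor, positive, pos_margin=0.1, neg_margin=1.4):
    """Contrastive descriptor loss with hardest-in-batch negative mining.

    anchor, positive: [keypts_num, dim] L2-normalized descriptors of
    corresponding keypoints from the two fragments.
    """
    # Pairwise Euclidean distances, shape [keypts_num, keypts_num].
    dists = tf.norm(tf.expand_dims(anchor, 1) - tf.expand_dims(positive, 0), axis=2)
    # Distances of the true correspondences are on the diagonal.
    pos_dist = tf.linalg.diag_part(dists)
    # Mask the diagonal so mining picks the hardest *non-matching* descriptor.
    diag_mask = tf.eye(tf.shape(dists)[0]) * 1e5
    hardest_neg = tf.reduce_min(dists + diag_mask, axis=1)
    # Pull positives inside pos_margin, push the hardest negative past neg_margin.
    loss = tf.maximum(pos_dist - pos_margin, 0.0) + \
           tf.maximum(neg_margin - hardest_neg, 0.0)
    return tf.reduce_mean(loss)
```

This also shows why the setting matters: with keypts_num = 256 the hardest negative is mined from a pool four times larger than with 64, so the mined negatives are much harder and the loss can stall.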

Another alternative is the circle loss proposed in this paper. I find it converges better and faster, and it has no convergence issue with keypts_num = 256 (which gives slightly better performance; that's why I changed keypts_num to 256, but I forgot to change the loss type accordingly in that commit).
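For the same setup, a sketch of circle loss with one positive per anchor looks like the following; m and gamma are the defaults from the circle loss paper, not necessarily the values used in this repo:

```python
def circle_loss(anchor, positive, m=0.25, gamma=64.0):
    """Circle loss (Sun et al., CVPR 2020) with one positive per anchor.

    anchor, positive: [keypts_num, dim] L2-normalized descriptors, so the
    matmul below gives cosine similarities.
    """
    sim = tf.matmul(anchor, positive, transpose_b=True)  # [N, N] similarities
    pos_sim = tf.linalg.diag_part(sim)                   # similarities of true pairs
    n = tf.shape(sim)[0]

    delta_p, delta_n = 1.0 - m, m                        # decision margins
    alpha_p = tf.nn.relu(1.0 + m - pos_sim)              # adaptive weight, positives
    alpha_n = tf.nn.relu(sim + m)                        # adaptive weight, negatives

    logit_p = -gamma * alpha_p * (pos_sim - delta_p)     # [N]
    logit_n = gamma * alpha_n * (sim - delta_n)          # [N, N]
    logit_n = logit_n - tf.eye(n) * 1e12                 # exclude the diagonal (positives)
    # Per-anchor loss: softplus over the combined positive and negative terms.
    return tf.reduce_mean(tf.nn.softplus(logit_p + tf.reduce_logsumexp(logit_n, axis=1)))
```

Unlike hardest-negative mining, the log-sum-exp softly weights all negatives, which is one intuition for why it tolerates a larger keypts_num.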

I will add some notes on circle loss and double-check the configuration later. For now, you can try changing keypts_num to 64 in training_3DMatch.py, or changing the loss type to circle loss in KPFCNN_model.py. Sorry for the inconvenience.
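Putting the two workarounds together with the sketches above (tensor names, descriptor dimension, and the use_circle switch are all placeholders, not the repo's real identifiers):

```python
# Toy wiring of the two options, using the loss sketches above.
keypts_num, dim = 64, 32
anchor = tf.nn.l2_normalize(tf.random_normal([keypts_num, dim]), axis=1)
positive = tf.nn.l2_normalize(tf.random_normal([keypts_num, dim]), axis=1)

use_circle = False  # False: contrastive loss with keypts_num = 64
                    # True:  circle loss, which also works with keypts_num = 256
desc_loss = circle_loss(anchor, positive) if use_circle \
            else hardest_contrastive_loss(anchor, positive)
```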

Let me know if you still have this problem.

HXACA commented 4 years ago

Thanks for your response, I will try it later.

HXACA commented 4 years ago

Sorry to bother you, I have a new problem. When I tried the circle loss, after about 30 epochs the loss suddenly increased to around 2 and the accuracy fell to single digits. I tried reducing the learning rate, but the problem still occurs. May I ask if you have encountered this?

XuyangBai commented 4 years ago

@HXACA What are your TensorFlow and CUDA versions? It seems very similar to the problem described here.

HXACA commented 4 years ago

I am using an RTX 2080 Ti with CUDA 10 and TensorFlow 1.12. I will try CUDA 9 later. Thanks for your response.

HXACA commented 4 years ago

Sorry to bother you again. I want to know the keypts_num for the KITTI dataset. Is it the same as for 3DMatch? I tried keypts_num = 256 with circle loss, but the results were not as good as yours, and with the default value of 1024 the network does not seem to converge.

XuyangBai commented 4 years ago

@HXACA Hi, could you try keypts_num = 64 with contrastive loss for KITTI? It seems I forgot to change the default value of keypts_num. I haven't tried circle loss on KITTI.

ZYCheng-coder commented 3 years ago

Thanks for your project @XuyangBai @HXACA. I ran into the same problem and changed keypts_num to 64. After training for 14 epochs, the loss was close to …, and the accuracy was close to 0. I created the environment from your environment.yml and am using an RTX 2080 Ti. Could you give some advice about this?