Claud1234 / CLFT

This is the repository for FCN and Transformer based object segmentation that relies on the fusion of camera and LiDAR data.

Could you please tell me the time it took for training? #2

Open palpitatingaaa opened 4 weeks ago

palpitatingaaa commented 4 weeks ago

Thank you for your excellent work. Can multiple GPUs be used for training at the same time? Training on a single 4090 takes about 30 hours.

AirPlanBird commented 3 weeks ago

Hello, I would like to ask whether your training results reach the accuracy reported by the author. My training IoU is below 0.9.

palpitatingaaa commented 3 weeks ago

@AirPlanBird Hello, I have not finished training yet. Feel free to add me on QQ (1007741604) so we can discuss it together.

Claud1234 commented 3 weeks ago

@palpitatingaaa Hi. Sorry for the late reply; I was on vacation for the last two weeks. Unfortunately, there is currently no support for using multiple GPUs in parallel, because I did not have a multi-GPU setup when I was working on this. As for the training time, it is reasonable: I was using an A100 GPU, and training also took more than one day to finish. I recommend doing some hyperparameter tuning and stopping the training early. In our latest experiments, the IoU usually peaks at around 200 epochs.
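The early stopping suggested above can be sketched as follows. This is a minimal illustration, not code from the CLFT repository: the `EarlyStopping` class, its `patience` and `min_delta` parameters, and the IoU values in the usage loop are all hypothetical. The idea is to track the best validation IoU seen so far and stop once it has not improved for a fixed number of epochs.

```python
class EarlyStopping:
    """Signal a stop when validation IoU has not improved for `patience` epochs.

    Hypothetical helper for illustration; not part of the CLFT code base.
    """

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum gain that counts as improvement
        self.best_iou = float("-inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_iou):
        """Record this epoch's validation IoU; return True if training should stop."""
        if val_iou > self.best_iou + self.min_delta:
            self.best_iou = val_iou
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Usage with made-up validation IoU values (not real CLFT results).
stopper = EarlyStopping(patience=3)
fake_val_iou = [0.50, 0.62, 0.70, 0.69, 0.70, 0.68, 0.71]
stopped_at = None
for epoch, iou in enumerate(fake_val_iou):
    if stopper.step(epoch, iou):
        stopped_at = epoch
        break

print("stopped at epoch", stopped_at, "best epoch", stopper.best_epoch)
```

In a real training loop, the checkpoint from `best_epoch` would typically be the one kept for evaluation.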

@AirPlanBird Something must be wrong if your IoU is this low on the dataset I provided. We are now extending the work to more classes, and no class is below 50%; the corresponding code is in the branch 'conference_paper_training'. My advice for now is to check that branch first. I am still cleaning up the code, so please also check the branch 'cleaning', which contains only the two classes 'vehicle' and 'human', as in our paper. Please also check the function at https://github.com/Claud1234/CLFT/blob/885e58f6f49262c5015bf16fad60e6c21012c709/tools/trainer.py#L168 to make sure the basic IoU calculation is correct.
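For anyone cross-checking their numbers, here is a small reference sketch of per-class IoU in pure Python. It is not the function at the link above; the name `class_iou` and the label ids (0 = background, 1 = vehicle, 2 = human) are assumptions made only for this example. Comparing its output with the training code on a tiny hand-made mask is one way to confirm that the metric is computed as expected.

```python
def class_iou(pred, target, class_id):
    """Intersection over union for one class.

    `pred` and `target` are flat sequences of per-pixel integer labels.
    Returns None when the class is absent from both, since IoU is undefined there.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = 0
    union = 0
    for p, t in zip(pred, target):
        p_hit = p == class_id
        t_hit = t == class_id
        if p_hit and t_hit:
            intersection += 1
        if p_hit or t_hit:
            union += 1
    if union == 0:
        return None
    return intersection / union


# Hypothetical label ids: 0 = background, 1 = vehicle, 2 = human.
pred = [0, 1, 1, 2, 0, 1]
target = [0, 1, 2, 2, 0, 0]

print("vehicle IoU:", class_iou(pred, target, 1))  # 1 overlapping pixel out of 3
print("human IoU:", class_iou(pred, target, 2))    # 1 overlapping pixel out of 2
```

Pixels that belong to the class in neither mask are ignored, so a large background area does not inflate the score.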

Finally, @palpitatingaaa, I am happy to help where needed, but I do not have QQ at the moment and need to look into how to use it on my Linux computer. You can reach me directly on Skype via this link: https://join.skype.com/invite/hign90IeLxg0

palpitatingaaa commented 3 weeks ago

Thank you for the great work and for your reply. I hope we can keep in touch in the future.