tsunghan-wu / RandLA-Net-pytorch

:four_leaf_clover: Pytorch Implementation of RandLA-Net (https://arxiv.org/abs/1911.11236)
MIT License
122 stars 33 forks

Excessive memory requirements #10

Open huixiancheng opened 3 years ago

huixiancheng commented 3 years ago

Hi, I would like to know how much memory you need for testing SemanticKITTI. When setting batch=1, I need almost 32G of memory (not GPU memory). Is this normal? Or is there any way to reduce that demand?

fenfenglitech commented 2 years ago

Hi @huixiancheng, did you test successfully? I ran into some problems when testing: I can't run the testing process on sequences 13, 19, and 21, though the other sequences succeeded. Could you give me some advice?

huixiancheng commented 2 years ago

It may be caused by running out of memory. A simple way to solve this is to slice the data list here: https://github.com/tsunghan-mama/RandLA-Net-pytorch/blob/913837e846176e4247a7e21783bf8f2f38576257/dataset/semkitti_testset.py#L26

For example, seq 08 has 4071 scans; just infer in two passes. It's rough but effective, and it had no impact on accuracy in my tests. First run with self.data_list = sorted(self.data_list)[0:3000], then infer again with self.data_list = sorted(self.data_list)[3000:].
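For reference, the two-pass split could look like the minimal sketch below. The `make_passes` helper, `SPLIT` constant, and dummy file names are illustrative, not part of the repo; in practice the slicing would go where `self.data_list` is built in dataset/semkitti_testset.py.

```python
# Two-pass inference sketch for capping peak host memory: split the sorted
# scan list and run inference once per slice instead of all at once.
SPLIT = 3000

def make_passes(data_list, split=SPLIT):
    """Return two slices that together cover every scan exactly once."""
    ordered = sorted(data_list)
    return ordered[:split], ordered[split:]

# Example: sequence 08 has 4071 scans.
scans = ["08/velodyne/%06d.bin" % i for i in range(4071)]
first, second = make_passes(scans)
assert len(first) + len(second) == len(scans)  # nothing lost or duplicated
```

Since each pass only covers a disjoint slice of the scans, the predictions from the two runs can simply be concatenated; no scan is processed twice, so accuracy is unaffected.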

fenfenglitech commented 2 years ago

Thank you for your help; I have solved that problem. However, my submission failed to score on the competition server, like this:

What can I do?


fenfenglitech commented 2 years ago


Hoping you could give me some suggestions, thank you very much! To add: my work is based on the original RandLA-Net code, and I appended my label files at the end.


huixiancheng commented 2 years ago

I haven't used the original code, so I can't give advice on it. All you need to pay attention to is the error log given by CodaLab.
Maybe you can try to get help here.

fenfenglitech commented 2 years ago

What is your environment? I want to try running your code.

huixiancheng commented 2 years ago

Just this repo, with inference in "all" mode.

I did not submit to the test server. I think that if the API verification passes on the validation set, the test will be fine as well.

fenfenglitech commented 2 years ago

Hi, I want to thank you for your suggestions. My problems were solved with your help, and now my work gets a great score. Thank you very much!!!


xlr-project commented 2 years ago

Hi @huixiancheng, I have run data_prepare_semantickitti.py successfully, but training the model fails with this error: RuntimeError: weight tensor should be defined either for all 19 classes or no classes but got weight tensor of shape: [1, 19]. What can I do?

huixiancheng commented 2 years ago

Hi, I did not run into this problem. Maybe you should check the number of classes and the class weights.

Here are the weights I once calculated and used:

class_weights = torch.tensor([[17.1775, 49.4507, 49.0822, 45.9186, 44.9319, 49.0655, 49.6848, 49.8644, 5.3651, 31.3473, 7.2697, 41.0090, 5.5935, 11.1401, 2.8727, 37.3551, 9.1705, 43.3172, 48.0677]]).cuda()

It really is a tensor of shape torch.Size([1, 19]).
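For what it's worth, the error message suggests that recent PyTorch versions expect the class-weight tensor passed to the loss to be 1-D, with one entry per class. A minimal sketch of flattening the [1, 19] tensor before handing it to nn.CrossEntropyLoss (the .cuda() call is dropped here so it runs on CPU; the logits and labels are dummy data, not from the repo):

```python
import torch
import torch.nn as nn

# The weights quoted above, shape [1, 19].
class_weights = torch.tensor([[17.1775, 49.4507, 49.0822, 45.9186, 44.9319,
                               49.0655, 49.6848, 49.8644, 5.3651, 31.3473,
                               7.2697, 41.0090, 5.5935, 11.1401, 2.8727,
                               37.3551, 9.1705, 43.3172, 48.0677]])

# Flatten to shape [19]: one weight per class, as CrossEntropyLoss expects.
criterion = nn.CrossEntropyLoss(weight=class_weights.flatten())

logits = torch.randn(8, 19)           # (batch, num_classes) dummy predictions
labels = torch.randint(0, 19, (8,))   # dummy ground-truth class indices
loss = criterion(logits, labels)      # scalar loss, no shape error
```

Flattening the weights this way may be an alternative to pinning an older torch version, under the assumption that the shape check is the only incompatibility.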

xlr-project commented 2 years ago

@huixiancheng, thank you very much for your data and advice. I tried it, but it still doesn't work. Do you think this problem could be related to checkpoint.rar? I can't get it from the link in your README.md; it was empty.

huixiancheng commented 2 years ago

No, I don't think that would affect it.

xlr-project commented 2 years ago

@huixiancheng I am very grateful for your advice. I will try again, thank you very much.

huixiancheng commented 2 years ago

@xlr-project Maybe you are using torch==1.10? I just reproduced your error with that setting (torch==1.10 with cuda==11.3). When I changed to torch==1.8.1 and cuda==11.1, it worked well.

xlr-project commented 2 years ago


Thank you very much, I already made it work successfully.