haofeixu / aanet

[CVPR'20] AANet: Adaptive Aggregation Network for Efficient Stereo Matching
Apache License 2.0

RuntimeError: CUDA error: an illegal memory access was encountered #80

Closed: q5390498 closed this issue 1 year ago

q5390498 commented 2 years ago

[2022-01-25 19:41:34,285] => Loading pretrained AANet: pretrained/aanet+_kitti15-2075aea1.pth
[2022-01-25 19:41:34,403] => Number of trainable parameters: 8514130
[2022-01-25 19:41:34,406] => Start training...
Traceback (most recent call last):
  File "train.py", line 246, in <module>
    main()
  File "train.py", line 236, in main
    train_model.train(train_loader)
  File "/home/yunhuizhang/project/aanet/model.py", line 68, in train
    pred_disp_pyramid = self.aanet(left, right)  # list of H/12, H/6, H/3, H/2, H
  File "/home/yunhuizhang/anaconda3/envs/aanet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yunhuizhang/project/aanet/nets/aanet.py", line 207, in forward
    left_feature = self.feature_extraction(left_img)
  File "/home/yunhuizhang/project/aanet/nets/aanet.py", line 134, in feature_extraction
    feature = self.feature_extractor(img)
  File "/home/yunhuizhang/anaconda3/envs/aanet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yunhuizhang/project/aanet/nets/feature.py", line 426, in forward
    x = self.conv_start(x)
  File "/home/yunhuizhang/anaconda3/envs/aanet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yunhuizhang/anaconda3/envs/aanet/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/yunhuizhang/anaconda3/envs/aanet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yunhuizhang/project/aanet/nets/deform.py", line 73, in forward
    offset_mask = self.offset_conv(x)
  File "/home/yunhuizhang/anaconda3/envs/aanet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/yunhuizhang/anaconda3/envs/aanet/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 343, in forward
    return self.conv2d_forward(input, self.weight)
  File "/home/yunhuizhang/anaconda3/envs/aanet/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 340, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA error: an illegal memory access was encountered

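I ran into this error when running training. What could be causing it? My CUDA version is 11.3, and the rest of the environment was set up with conda following the README. Thanks!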

haofeixu commented 1 year ago

Hi @q5390498, sorry for the late response.

If this issue is still relevant to you, I would suggest trying our new GMStereo model: https://haofeixu.github.io/unimatch/ & https://github.com/autonomousvision/unimatch. No CUDA op is required. A Colab demo is also provided so you can try the model in your browser. Hope it helps, thanks.
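For anyone still debugging the original error: the thread does not establish a root cause, but one thing worth ruling out is a mismatch between the CUDA version the installed PyTorch build was compiled against and the local GPU and driver setup. The sketch below is only a diagnostic aid under that assumption. `collect_env_report` is a hypothetical helper name, not part of AANet or PyTorch; it relies only on standard PyTorch calls (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_name()`, `torch.cuda.get_device_capability()`).

```python
import os

# Must be set before CUDA is initialized. It makes kernel launches synchronous,
# so the Python traceback points at the op that actually failed.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")


def collect_env_report():
    """Describe the local PyTorch / CUDA setup (hypothetical helper)."""
    report = {
        "torch_version": None,
        "torch_cuda_build": None,
        "cuda_available": False,
        "devices": [],
    }
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed in this environment

    report["torch_version"] = torch.__version__
    # CUDA version PyTorch was built with; None for CPU-only builds.
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        for idx in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(idx)
            report["devices"].append(
                {
                    "name": torch.cuda.get_device_name(idx),
                    "compute_capability": f"{major}.{minor}",
                }
            )
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

If `torch_cuda_build` is much older than the locally installed CUDA toolkit, or the GPU's compute capability is newer than that build supports, rebuilding the environment with a matching PyTorch build may be worth trying before rerunning `train.py`.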