ifnspaml / SGDepth

[ECCV 2020] Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance
MIT License

RuntimeError: CUDA error: an illegal memory access was encountered #7

Closed chetanmreddy closed 3 years ago

chetanmreddy commented 3 years ago

Any idea why this might be happening?

Starting initialization
Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /root/.cache/torch/checkpoints/resnet18-5c106cde.pth
Loading training dataset metadata:

klingner commented 3 years ago

I do not usually get such errors, but CUDA errors are rather cryptic in most cases. Did you change anything in the code relative to the standard configuration?

Also, have you already tried running the code on a CPU, just to verify that it runs? An error on the CPU is sometimes more readable than one on the GPU. The code automatically detects whether a GPU is available and falls back to the CPU if not.
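For anyone who wants to try the CPU check on a machine that does have a GPU, here is a minimal sketch. It relies on two standard CUDA environment variables: an empty `CUDA_VISIBLE_DEVICES` hides all GPUs from the process, and `CUDA_LAUNCH_BLOCKING=1` makes kernel launches synchronous, so a GPU traceback is more likely to point at the operation that actually failed. The helper names `debug_env` and `run_training` are hypothetical and not part of this repository, and the script name and arguments are assumptions you would adapt to your setup.

```python
import os
import subprocess
import sys


def debug_env(mode, base_env=None):
    """Return a copy of the environment prepared for one debugging run.

    mode="cpu":  hide all GPUs so the training code falls back to the CPU.
    mode="sync": keep the GPU, but launch CUDA kernels synchronously.
    """
    env = dict(os.environ if base_env is None else base_env)
    if mode == "cpu":
        # An empty device list makes CUDA report that no GPU is available.
        env["CUDA_VISIBLE_DEVICES"] = ""
    elif mode == "sync":
        # Synchronous launches report errors at the failing call site.
        env["CUDA_LAUNCH_BLOCKING"] = "1"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return env


def run_training(mode, script="train.py", extra_args=()):
    """Run the training script in a child process and return its exit code.

    The script name and arguments are assumptions; pass whatever you
    normally use to start training.
    """
    cmd = [sys.executable, script, *extra_args]
    return subprocess.run(cmd, env=debug_env(mode)).returncode
```

Calling `run_training("cpu")` first and then `run_training("sync")` separates the two questions: whether the code runs at all, and which GPU operation fails. The variables have to be set before the process initializes CUDA, which is why the sketch starts a fresh child process instead of changing them inside a running one.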

chetanmreddy commented 3 years ago

I did not change anything in the code except for setting the correct paths to the data.

I tried running on the CPU and it trains without any errors. For now, I will keep this issue open and continue debugging. Thank you.

chetanmreddy commented 3 years ago

After trying some solutions from the PyTorch discussion forums, the error changed to this:

Traceback (most recent call last):
  File "train.py", line 372, in <module>
    trainer.train()
  File "train.py", line 340, in train
    self._run_epoch()
  File "train.py", line 249, in _run_epoch
    outputs = model(batch)
  File "/opt/conda/envs/torch_110/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/cmudi001-sgd-1/SGDepth/models/sgdepth.py", line 292, in forward
    x = self.seg(x)
  File "/opt/conda/envs/torch_110/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/cmudi001-sgd-1/SGDepth/models/sgdepth.py", line 79, in forward
    x = self.decoder(x)
  File "/opt/conda/envs/torch_110/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/cmudi001-sgd-1/SGDepth/models/networks/partial_decoder.py", line 134, in forward
    x = self.blocks[f'step{step}'](x)
  File "/opt/conda/envs/torch_110/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/cmudi001-sgd-1/SGDepth/models/networks/partial_decoder.py", line 69, in forward
    x_new = torch.cat((x_new, x_skp), 1)
RuntimeError: cuda runtime error (700) : an illegal memory access was encountered at /opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/THC/THCCachingHostAllocator.cpp:278

UPDATE: It seems that the error occurs in the segmentation part. Could you please elaborate on how to prepare the Cityscapes dataset?