IanYeung / RealVSR

Dataset and Code for ICCV 2021 paper "Real-world Video Super-resolution: A Benchmark Dataset and A Decomposition based Learning Scheme"
Apache License 2.0

CUDA out of memory #6

Open dzz416 opened 2 years ago

dzz416 commented 2 years ago

I only have one TITAN with 12 GB of memory. The batch size is set to 1, but I still get an error. What should I do?

IanYeung commented 2 years ago

You can reduce the training patch size or design a lighter model instead.
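For reference, the training patch size is set in the options yml rather than in the code. A minimal sketch of the dataset section is below; the key names (N_frames, GT_size, LQ_size, batch_size) are assumptions based on the EDVR/mmsr-style configs this repo follows, so match them against the file you actually run.

```yaml
# Sketch of the dataset options in an EDVR/mmsr-style training yml.
# Key names are assumed; verify them against the repo's own options file.
datasets:
  train:
    N_frames: 5      # fewer input frames also lowers activation memory
    GT_size: 128     # e.g. 256 -> 128 roughly quarters the per-patch memory
    LQ_size: 128     # keep consistent with GT_size and the scale factor
    batch_size: 1    # already at the minimum in this case
```

If memory is still tight after shrinking the patch, lowering N_frames or the generator's feature-channel width (the nf-style option, if present) is the next lever.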

dzz416 commented 2 years ago

> You can reduce the training patch size or design a lighter model instead.

It seems that it is not simply a lack of memory:

21-12-07 21:38:19.848 - INFO: Model [VideoSRModel] is created.
21-12-07 21:38:19.848 - INFO: Start training from epoch: 0, iter: 0
/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:134: UserWarning: Detected call of lr_scheduler.step() before optimizer.step(). In PyTorch 1.1.0 and later, you should call them in the opposite order: optimizer.step() before lr_scheduler.step(). Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
Traceback (most recent call last):
  File "codes/train.py", line 339, in <module>
    main()
  File "codes/train.py", line 162, in main
    model.optimize_parameters(current_step)
  File "/home1/dzzHD/RealVSR/codes/models/VideoSR_AllPair_model_YCbCr_Split.py", line 175, in optimize_parameters
    self.fake_H = self.netG(self.var_L)
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 158, in forward
    inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 175, in scatter
    return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim)
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 44, in scatter_kwargs
    inputs = scatter(inputs, target_gpus, dim) if inputs else []
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 36, in scatter
    res = scatter_map(inputs)
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 23, in scatter_map
    return list(zip(*map(scatter_map, obj)))
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/parallel/scatter_gather.py", line 19, in scatter_map
    return Scatter.apply(target_gpus, None, dim, obj)
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/parallel/_functions.py", line 96, in forward
    outputs = comm.scatter(input, target_gpus, chunk_sizes, ctx.dim, streams)
  File "/home/dzz/.conda/envs/realvsr/lib/python3.7/site-packages/torch/nn/parallel/comm.py", line 189, in scatter
    return tuple(torch._C._scatter(tensor, devices, chunk_sizes, dim, streams))
RuntimeError: CUDA error: out of memory
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

dzz416 commented 2 years ago

I am running edvr_realvsr_notsa_split.yml.
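A hedged observation on the traceback above: it fails while torch.nn.DataParallel scatters the very first input batch (comm.scatter), not deep inside the forward pass. That pattern usually means the GPU already has almost no free memory before training starts (for example another process is holding it, which nvidia-smi would show), or that the run is being scattered across more GPUs than are actually free. If edvr_realvsr_notsa_split.yml contains a multi-GPU gpu_ids entry, restricting it to the single TITAN is worth trying; the gpu_ids key name is an assumption based on mmsr-style configs.

```yaml
# Assumed mmsr/EDVR-style top-level entry; verify against edvr_realvsr_notsa_split.yml.
gpu_ids: [0]   # scatter only to the single available 12 GB GPU
```

Equivalently, launching with CUDA_VISIBLE_DEVICES=0 limits PyTorch to that one device without editing the yml.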

dzz416 commented 2 years ago

What should I do?