JingyunLiang / VRT

VRT: A Video Restoration Transformer (official repository)
https://arxiv.org/abs/2201.12288
Other
1.35k stars 126 forks source link

CUDNN_STATUS_INTERNAL_ERROR with my own set #28

Open Entretoize opened 2 years ago

Entretoize commented 2 years ago

Hello, I'm able to run VRT with provide sets then I tried with my own and it doesn't work I get this error:

(py38) H:\git\VRT>python main_test_vrt.py --task 008_VRT_videodenoising_DAVIS --sigma 10 --folder_lq testsets/mine --tile 8 160 180 --tile_overlap 2 20 20
h:\Anaconda3\envs\py38\lib\site-packages\torch\functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
H:\git\VRT\models\network_vrt.py:716: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  dim_t = temperature ** (2 * (dim_t // 2) / num_pos_feats)
loading model from ./model_zoo/vrt/model_zoo/vrt/008_VRT_videodenoising_DAVIS.pth
using dataset from testsets/mine
Traceback (most recent call last):
  File "main_test_vrt.py", line 346, in <module>
    main()
  File "main_test_vrt.py", line 72, in main
    output = test_video(lq, model, args)
  File "main_test_vrt.py", line 257, in test_video
    out_clip = test_clip(lq_clip, model, args)
  File "main_test_vrt.py", line 308, in test_clip
    out_patch = model(in_patch).detach().cpu()
  File "h:\Anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "H:\git\VRT\models\network_vrt.py", line 1382, in forward
    flows_backward, flows_forward = self.get_flows(x)
  File "H:\git\VRT\models\network_vrt.py", line 1413, in get_flows
    flows_backward, flows_forward = self.get_flow_2frames(x)
  File "H:\git\VRT\models\network_vrt.py", line 1436, in get_flow_2frames
    flows_backward = self.spynet(x_1, x_2)
  File "h:\Anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "H:\git\VRT\models\network_vrt.py", line 438, in forward
    flow_list = self.process(ref, supp, w, h, w_floor, h_floor)
  File "H:\git\VRT\models\network_vrt.py", line 412, in process
    flow = self.basic_module[level](torch.cat([
  File "h:\Anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "H:\git\VRT\models\network_vrt.py", line 356, in forward
    return self.basic_module(tensor_input)
  File "h:\Anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "h:\Anaconda3\envs\py38\lib\site-packages\torch\nn\modules\container.py", line 141, in forward
    input = module(input)
  File "h:\Anaconda3\envs\py38\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "h:\Anaconda3\envs\py38\lib\site-packages\torch\nn\modules\conv.py", line 447, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "h:\Anaconda3\envs\py38\lib\site-packages\torch\nn\modules\conv.py", line 443, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

I suppose this is the size of the images and/or the tile settings, I don't really understand how to choose that setting as there's 3 values, I think the second and third as the height and width of the tile but what is the first one, it doesn't need the count as it can be calculated so what ?

JingyunLiang commented 2 years ago

--tile 8 160 180 means frame=8, height=160 and width=180.

Entretoize commented 2 years ago

Ok, but what frame refers to ?

JingyunLiang commented 2 years ago

It crops the video into 8x160x180 clips for testing due to memory limit.

Entretoize commented 2 years ago

Then, it's strange that the first value is needed as it is = imagew*imageh/tilew/tileh I reduced my image to 320x180 and tried with --tile 4 160 90 which removed the previous error but now I have that:

RuntimeError: Given groups=1, weight of size [96, 28, 1, 3, 3], expected input[1, 27, 4, 160, 160] to have 28 channels, but got 27 channels instead

JingyunLiang commented 2 years ago

If you don't want to partition the video along the temporal dimension, you can use --tile 0 160 90.

As for the runtimeerror, please refer to https://github.com/JingyunLiang/VRT/issues/24

Entretoize commented 2 years ago

O..k... what does "partition the video along temporal dimension" mean ? I know that your method uses several frame to work, then is that what this value is ? If I set 0 what will it do ?

About the error, I understand that I need to give a GT path for it to work, but if I have the GT then the script is useless so there's something I missed...

JingyunLiang commented 2 years ago

Partition means partition the sequence into short clips. 0 means no partion.