JingyunLiang / RVRT

Recurrent Video Restoration Transformer with Guided Deformable Attention (NeurIPS 2022, official repository)
https://arxiv.org/abs/2206.02146

RuntimeError: Given groups=1, weight of size [192, 4, 1, 3, 3], expected input[1, 3, 100, 64, 64] to have 4 channels, but got 3 channels instead #11

Open YoungofNUAA opened 1 year ago

YoungofNUAA commented 1 year ago

I gave an .mp4 video as the input file, but the code gives me the error shown below:

RuntimeError: Given groups=1, weight of size [192, 4, 1, 3, 3], expected input[1, 3, 100, 64, 64] to have 4 channels, but got 3 channels instead

The SR model runs well with image input. Should the denoising model take a video as input? If so, what is the fourth channel for a video?

imironhead commented 1 year ago

There are some bugs in this script: RVRT/data/dataset_video_test.py

For example, it contains two implementations of class SingleVideoRecurrentTestDataset(data.Dataset).

To run the denoising model, the noise level must be concatenated to the input images (I have not read the paper yet, but the source code is implemented that way), and that is where the 4th channel comes from.
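The channel mismatch can be reproduced in isolation. The sketch below is not taken from the RVRT code: it assumes a plain `torch.nn.Conv3d` whose weight has the same shape as in the error message, `[192, 4, 1, 3, 3]`, and it uses a small dummy clip and a hypothetical noise level to keep things cheap.

```python
import torch
import torch.nn as nn

# Stand-in for the first layer of the denoising model. This is an assumption
# based on the weight shape in the error message, not the actual RVRT module.
conv = nn.Conv3d(in_channels=4, out_channels=192,
                 kernel_size=(1, 3, 3), padding=(0, 1, 1))
assert tuple(conv.weight.shape) == (192, 4, 1, 3, 3)

# Dummy RGB clip laid out as (batch, channels, frames, height, width).
rgb_clip = torch.rand(1, 3, 5, 16, 16)

# A 3-channel input is rejected by a layer that expects 4 channels.
try:
    conv(rgb_clip)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

# Appending one constant noise-level channel makes the shapes agree.
sigma = 10 / 255.  # hypothetical noise level, scaled to [0, 1]
noise_map = torch.full((1, 1, 5, 16, 16), sigma)
out = conv(torch.cat([rgb_clip, noise_map], dim=1))
```

With the extra channel in place, the layer accepts the input and returns a tensor with 192 feature channels.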

In my case (I ran the script with an image sequence), adding some code fixes the problem:

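class SingleVideoRecurrentTestDataset(data.Dataset):
    def __init__(self, opt):
        ....
        # Noise level scaled to [0, 1]; 0 (or no 'sigma' key) disables the extra channel.
        self.sigma = opt['sigma'] / 255. if 'sigma' in opt else 0

    def __getitem__(self, index):
        ...
        if self.sigma:
            # Append a constant noise-level map as a 4th channel to every frame.
            noise_level = torch.ones((1, 1, 1, 1)) * self.sigma
            t, _, h, w = imgs_lq.shape  # imgs_lq is (frames, channels, height, width)
            imgs_lq = torch.cat([imgs_lq, noise_level.expand(t, 1, h, w)], 1)

        return ...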
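For trying the same idea outside the dataset class, here is a minimal standalone sketch. The helper name `append_noise_level` and the dummy tensor sizes are hypothetical and not part of the repository. It assumes frames are stacked as (frames, channels, height, width) with values in [0, 1], as in the snippet above, and that `sigma` is given on the 0-255 scale.

```python
import torch


def append_noise_level(imgs_lq: torch.Tensor, sigma: float) -> torch.Tensor:
    """Append a constant noise-level map as an extra channel.

    imgs_lq: tensor of shape (t, c, h, w) with values in [0, 1].
    sigma:   noise level on the 0-255 scale (hypothetical parameter name).
    """
    t, _, h, w = imgs_lq.shape
    # One constant map per frame, on the same device and dtype as the input.
    noise_level = imgs_lq.new_full((t, 1, h, w), sigma / 255.)
    return torch.cat([imgs_lq, noise_level], dim=1)


# Usage: a dummy 5-frame RGB clip gains a 4th channel.
frames = torch.rand(5, 3, 16, 16)
frames_with_sigma = append_noise_level(frames, sigma=10)
```

Unlike the dataset patch, this helper always appends the channel, even when `sigma` is 0, so the caller decides whether the model being run expects it.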