XPixelGroup / HAT

CVPR 2023 - Activating More Pixels in Image Super-Resolution Transformer
arXiv - HAT: Hybrid Attention Transformer for Image Restoration
Apache License 2.0

TypeError: calculate_psnr() missing 1 required positional argument: 'img2' #128

Closed: AIconfig closed this issue 7 months ago

AIconfig commented 7 months ago

I tried to run HAT, and the HAT-L_SRx4_ImageNet-pretrain.yml test throws an exception. I don't understand it well enough to find any clues about what it could mean, so I'm posting it here in the hope that someone has had this issue before and knows how to fix it.

Steps to reproduce:

git clone https://github.com/XPixelGroup/HAT
cd HAT
conda create -n hat python=3.10
conda activate hat

pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

pip install -r requirements.txt
python setup.py develop

Download the pretrained model and place it in the ./experiments/pretrained_models folder.

Make a test copy: cp options/test/HAT-L_SRx4_ImageNet-pretrain.yml options/test/HAT-L_SRx4_ImageNet-pretrain_custom.yml. Then change HAT-L_SRx4_ImageNet-pretrain_custom.yml to include the following (I need this because I only have 16 GB of VRAM; without tile_size: 256 I get an OOM error, but with it the test runs). A quick sanity check of these values follows the excerpt:

tile: # use the tile mode for limited GPU memory when testing.
  tile_size: 256 # the higher the value, the more GPU memory is used and the smaller the quality difference versus processing the full image; must be an integer multiple of the window size.
  tile_pad: 32 # overlap between adjacent patches; must be an integer multiple of the window size.

datasets:
  test_1:  # the 1st test dataset
    name: hatL4x
    type: SingleImageDataset
    dataroot_lq: /srv/shared/AI/AUTOMATIC1111/Upscalers/HAT/datasets/test2/
    io_backend:
      type: disk
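
For a quick sanity check of those values: window_size is 16 in this config (see network_g in the log below), so tile_size: 256 and tile_pad: 32 both satisfy the multiple-of-the-window-size rule, and the tile count is just a ceiling division over the LQ image. A minimal sketch (not the repository's actual tiling loop, which lives in hat/models/hat_model.py; the 1200x1200 input size is a hypothetical example):

import math

WINDOW_SIZE = 16  # network_g.window_size from this config
TILE_SIZE = 256
TILE_PAD = 32

# Both values must be integer multiples of the window size.
assert TILE_SIZE % WINDOW_SIZE == 0 and TILE_PAD % WINDOW_SIZE == 0

def num_tiles(lq_h: int, lq_w: int, tile_size: int = TILE_SIZE) -> int:
    # Tiles are laid out on a grid over the low-quality input image.
    return math.ceil(lq_h / tile_size) * math.ceil(lq_w / tile_size)

# A hypothetical 1200x1200 LQ input gives a 5 x 5 grid = 25 tiles,
# which matches the "Tile 25/25" lines in the log further down.
print(num_tiles(1200, 1200))  # 25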

Execute test: python hat/test.py -opt options/test/HAT-L_SRx4_ImageNet-pretrain_custom.yml

In my case it results in the following error:

/home/user/miniconda3/envs/hat/lib/python3.10/site-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be removed in 0.17. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
  warnings.warn(
Disable distributed.
Path already exists. Rename it to /srv/shared/AI/AUTOMATIC1111/Upscalers/HAT/results/HAT-L_SRx4_ImageNet-pretrain_archived_20240223_012755
2024-02-23 01:27:55,866 INFO: 
                ____                _       _____  ____
               / __ ) ____ _ _____ (_)_____/ ___/ / __ \
              / __  |/ __ `// ___// // ___/\__ \ / /_/ /
             / /_/ // /_/ /(__  )/ // /__ ___/ // _, _/
            /_____/ \__,_//____//_/ \___//____//_/ |_|
     ______                   __   __                 __      __
    / ____/____   ____   ____/ /  / /   __  __ _____ / /__   / /
   / / __ / __ \ / __ \ / __  /  / /   / / / // ___// //_/  / /
  / /_/ // /_/ // /_/ // /_/ /  / /___/ /_/ // /__ / /<    /_/
  \____/ \____/ \____/ \____/  /_____/\____/ \___//_/|_|  (_)

Version Information: 
    BasicSR: 1.3.4.9
    PyTorch: 2.0.1+cu118
    TorchVision: 0.15.2+cu118
2024-02-23 01:27:55,866 INFO: 
  name: HAT-L_SRx4_ImageNet-pretrain
  model_type: HATModel
  scale: 4
  num_gpu: 1
  manual_seed: 0
  tile:[
    tile_size: 256
    tile_pad: 32
  ]
  datasets:[
    test_1:[
      name: hatL4x
      type: SingleImageDataset
      dataroot_lq: /srv/shared/AI/AUTOMATIC1111/Upscalers/HAT/datasets/test2/
      io_backend:[
        type: disk
      ]
      phase: test
      scale: 4
    ]
  ]
  network_g:[
    type: HAT
    upscale: 4
    in_chans: 3
    img_size: 64
    window_size: 16
    compress_ratio: 3
    squeeze_factor: 30
    conv_scale: 0.01
    overlap_ratio: 0.5
    img_range: 1.0
    depths: [6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
    embed_dim: 180
    num_heads: [6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]
    mlp_ratio: 2
    upsampler: pixelshuffle
    resi_connection: 1conv
  ]
  path:[
    pretrain_network_g: ./experiments/pretrained_models/HAT-L_SRx4_ImageNet-pretrain.pth
    strict_load_g: True
    param_key_g: params_ema
    results_root: /srv/shared/AI/AUTOMATIC1111/Upscalers/HAT/results/HAT-L_SRx4_ImageNet-pretrain
    log: /srv/shared/AI/AUTOMATIC1111/Upscalers/HAT/results/HAT-L_SRx4_ImageNet-pretrain
    visualization: /srv/shared/AI/AUTOMATIC1111/Upscalers/HAT/results/HAT-L_SRx4_ImageNet-pretrain/visualization
  ]
  val:[
    save_img: True
    suffix: None
    metrics:[
      psnr:[
        type: calculate_psnr
        crop_border: 4
        test_y_channel: True
      ]
      ssim:[
        type: calculate_ssim
        crop_border: 4
        test_y_channel: True
      ]
    ]
  ]
  dist: False
  rank: 0
  world_size: 1
  auto_resume: False
  is_train: False

2024-02-23 01:27:55,866 INFO: Dataset [SingleImageDataset] - hatL4x is built.
2024-02-23 01:27:55,867 INFO: Number of test images in hatL4x: 1
/home/user/miniconda3/envs/hat/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
2024-02-23 01:27:56,422 INFO: Network [HAT] is created.
2024-02-23 01:27:56,594 INFO: Network: HAT, with parameters: 40,846,575
2024-02-23 01:27:56,594 INFO: HAT(
  (conv_first): Conv2d(3, 180, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (patch_embed): PatchEmbed(
    (norm): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
  )
  (patch_unembed): PatchUnEmbed()
  (pos_drop): Dropout(p=0.0, inplace=False)
  (layers): ModuleList(
    (0): RHAG(
      (residual_group): AttenBlocks(
        (blocks): ModuleList(
          (0): HAB(
            (norm1): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
            (attn): WindowAttention(
              (qkv): Linear(in_features=180, out_features=540, bias=True)
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Linear(in_features=180, out_features=180, bias=True)
              (proj_drop): Dropout(p=0.0, inplace=False)
              (softmax): Softmax(dim=-1)
            )
            (conv_block): CAB(
              (cab): Sequential(
                (0): Conv2d(180, 60, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
                (1): GELU(approximate='none')
                (2): Conv2d(60, 180, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
                (3): ChannelAttention(
                  (attention): Sequential(
                    (0): AdaptiveAvgPool2d(output_size=1)
                    (1): Conv2d(180, 6, kernel_size=(1, 1), stride=(1, 1))
                    (2): ReLU(inplace=True)
                    (3): Conv2d(6, 180, kernel_size=(1, 1), stride=(1, 1))
                    (4): Sigmoid()
                  )
                )
              )
            )
            (drop_path): Identity()
            (norm2): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
            (mlp): Mlp(
              (fc1): Linear(in_features=180, out_features=360, bias=True)
              (act): GELU(approximate='none')
              (fc2): Linear(in_features=360, out_features=180, bias=True)
              (drop): Dropout(p=0.0, inplace=False)
            )
          )
          (1-5): 5 x HAB(
            (norm1): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
            (attn): WindowAttention(
              (qkv): Linear(in_features=180, out_features=540, bias=True)
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Linear(in_features=180, out_features=180, bias=True)
              (proj_drop): Dropout(p=0.0, inplace=False)
              (softmax): Softmax(dim=-1)
            )
            (conv_block): CAB(
              (cab): Sequential(
                (0): Conv2d(180, 60, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
                (1): GELU(approximate='none')
                (2): Conv2d(60, 180, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
                (3): ChannelAttention(
                  (attention): Sequential(
                    (0): AdaptiveAvgPool2d(output_size=1)
                    (1): Conv2d(180, 6, kernel_size=(1, 1), stride=(1, 1))
                    (2): ReLU(inplace=True)
                    (3): Conv2d(6, 180, kernel_size=(1, 1), stride=(1, 1))
                    (4): Sigmoid()
                  )
                )
              )
            )
            (drop_path): DropPath()
            (norm2): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
            (mlp): Mlp(
              (fc1): Linear(in_features=180, out_features=360, bias=True)
              (act): GELU(approximate='none')
              (fc2): Linear(in_features=360, out_features=180, bias=True)
              (drop): Dropout(p=0.0, inplace=False)
            )
          )
        )
        (overlap_attn): OCAB(
          (norm1): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
          (qkv): Linear(in_features=180, out_features=540, bias=True)
          (unfold): Unfold(kernel_size=(24, 24), dilation=1, padding=4, stride=16)
          (softmax): Softmax(dim=-1)
          (proj): Linear(in_features=180, out_features=180, bias=True)
          (norm2): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
          (mlp): Mlp(
            (fc1): Linear(in_features=180, out_features=360, bias=True)
            (act): GELU(approximate='none')
            (fc2): Linear(in_features=360, out_features=180, bias=True)
            (drop): Dropout(p=0.0, inplace=False)
          )
        )
      )
      (conv): Conv2d(180, 180, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (patch_embed): PatchEmbed()
      (patch_unembed): PatchUnEmbed()
    )
    (1-11): 11 x RHAG(
      (residual_group): AttenBlocks(
        (blocks): ModuleList(
          (0-5): 6 x HAB(
            (norm1): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
            (attn): WindowAttention(
              (qkv): Linear(in_features=180, out_features=540, bias=True)
              (attn_drop): Dropout(p=0.0, inplace=False)
              (proj): Linear(in_features=180, out_features=180, bias=True)
              (proj_drop): Dropout(p=0.0, inplace=False)
              (softmax): Softmax(dim=-1)
            )
            (conv_block): CAB(
              (cab): Sequential(
                (0): Conv2d(180, 60, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
                (1): GELU(approximate='none')
                (2): Conv2d(60, 180, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
                (3): ChannelAttention(
                  (attention): Sequential(
                    (0): AdaptiveAvgPool2d(output_size=1)
                    (1): Conv2d(180, 6, kernel_size=(1, 1), stride=(1, 1))
                    (2): ReLU(inplace=True)
                    (3): Conv2d(6, 180, kernel_size=(1, 1), stride=(1, 1))
                    (4): Sigmoid()
                  )
                )
              )
            )
            (drop_path): DropPath()
            (norm2): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
            (mlp): Mlp(
              (fc1): Linear(in_features=180, out_features=360, bias=True)
              (act): GELU(approximate='none')
              (fc2): Linear(in_features=360, out_features=180, bias=True)
              (drop): Dropout(p=0.0, inplace=False)
            )
          )
        )
        (overlap_attn): OCAB(
          (norm1): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
          (qkv): Linear(in_features=180, out_features=540, bias=True)
          (unfold): Unfold(kernel_size=(24, 24), dilation=1, padding=4, stride=16)
          (softmax): Softmax(dim=-1)
          (proj): Linear(in_features=180, out_features=180, bias=True)
          (norm2): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
          (mlp): Mlp(
            (fc1): Linear(in_features=180, out_features=360, bias=True)
            (act): GELU(approximate='none')
            (fc2): Linear(in_features=360, out_features=180, bias=True)
            (drop): Dropout(p=0.0, inplace=False)
          )
        )
      )
      (conv): Conv2d(180, 180, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (patch_embed): PatchEmbed()
      (patch_unembed): PatchUnEmbed()
    )
  )
  (norm): LayerNorm((180,), eps=1e-05, elementwise_affine=True)
  (conv_after_body): Conv2d(180, 180, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_before_upsample): Sequential(
    (0): Conv2d(180, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.01, inplace=True)
  )
  (upsample): Upsample(
    (0): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): PixelShuffle(upscale_factor=2)
    (2): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): PixelShuffle(upscale_factor=2)
  )
  (conv_last): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
2024-02-23 01:27:56,721 INFO: Loading HAT model from ./experiments/pretrained_models/HAT-L_SRx4_ImageNet-pretrain.pth, with param key: [params_ema].
2024-02-23 01:27:56,911 INFO: Model [HATModel] is created.
2024-02-23 01:27:56,911 INFO: Testing hatL4x...
/home/user/miniconda3/envs/hat/lib/python3.10/site-packages/torch/nn/modules/conv.py:459: UserWarning: Applied workaround for CuDNN issue, install nvrtc.so (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:80.)
  return F.conv2d(input, weight, bias, self.stride,
    Tile 1/25
    Tile 2/25
    Tile 3/25
    Tile 4/25
    Tile 5/25
    Tile 6/25
    Tile 7/25
    Tile 8/25
    Tile 9/25
    Tile 10/25
    Tile 11/25
    Tile 12/25
    Tile 13/25
    Tile 14/25
    Tile 15/25
    Tile 16/25
    Tile 17/25
    Tile 18/25
    Tile 19/25
    Tile 20/25
    Tile 21/25
    Tile 22/25
    Tile 23/25
    Tile 24/25
    Tile 25/25
Traceback (most recent call last):
  File "/srv/shared/AI/AUTOMATIC1111/Upscalers/HAT/hat/test.py", line 11, in <module>
    test_pipeline(root_path)
  File "/home/user/miniconda3/envs/hat/lib/python3.10/site-packages/basicsr/test.py", line 40, in test_pipeline
    model.validation(test_loader, current_iter=opt['name'], tb_logger=None, save_img=opt['val']['save_img'])
  File "/home/user/miniconda3/envs/hat/lib/python3.10/site-packages/basicsr/models/base_model.py", line 48, in validation
    self.nondist_validation(dataloader, current_iter, tb_logger, save_img)
  File "/srv/shared/AI/AUTOMATIC1111/Upscalers/HAT/hat/models/hat_model.py", line 172, in nondist_validation
    self.metric_results[name] += calculate_metric(metric_data, opt_)
  File "/home/user/miniconda3/envs/hat/lib/python3.10/site-packages/basicsr/metrics/__init__.py", line 19, in calculate_metric
    metric = METRIC_REGISTRY.get(metric_type)(**data, **opt)
TypeError: calculate_psnr() missing 1 required positional argument: 'img2'
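
The dispatch at the bottom of the trace expands metric_data into keyword arguments, and with SingleImageDataset there is no ground-truth image to fill the second one. A minimal reconstruction of the failure, assuming basicsr's standard calculate_psnr(img, img2, crop_border, ...) signature (the arrays below are stand-ins):

import numpy as np
from basicsr.metrics import calculate_psnr  # signature: (img, img2, crop_border, ...)

sr_img = np.zeros((64, 64, 3), dtype=np.uint8)  # stand-in for the upscaled output

# With a ground-truth image, both positional arguments are supplied:
gt_img = sr_img.copy()
print(calculate_psnr(sr_img, gt_img, crop_border=4))  # inf for identical images

# With SingleImageDataset there is no GT, so metric_data == {'img': sr_img}
# and METRIC_REGISTRY.get('calculate_psnr')(**data, **opt) reduces to the
# call below, which raises the TypeError reported in this issue:
# calculate_psnr(img=sr_img, crop_border=4, test_y_channel=True)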

If anyone has any tips on how it can be fixed (maybe by installing a different PyTorch version, or something else), please share your findings. Thank you so much!

chxy95 commented 7 months ago

@AIconfig Comment out the 'metrics' part of the config. PSNR/SSIM cannot be calculated for a non-reference single image (there is no ground truth to compare against).
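
Concretely, that means disabling the metrics block in the custom yml, e.g. (a sketch of the edited val section, using the keys shown in the config dump above):

val:
  save_img: true
  suffix: ~
  # Commented out: calculate_psnr/calculate_ssim require a ground-truth
  # image ('img2'), which SingleImageDataset does not provide.
  # metrics:
  #   psnr:
  #     type: calculate_psnr
  #     crop_border: 4
  #     test_y_channel: true
  #   ssim:
  #     type: calculate_ssim
  #     crop_border: 4
  #     test_y_channel: true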

AIconfig commented 7 months ago

@AIconfig Comment out the 'metrics' part of the config. PSNR/SSIM cannot be calculated for a non-reference single image (there is no ground truth to compare against).

Thank you so much! I can confirm it works after commenting out the metrics part.