zengkun301 / DCTLSA

Densely Connected Transformer with Linear Self-Attention for Lightweight Image Super-Resolution

Network Feedback #1

Open muslll opened 9 months ago

muslll commented 9 months ago

Hi, first of all thanks to everyone who worked on DCTLSA. I want to give some feedback on this project: I've added it to neosr and trained both bicubic and realistic models with it. However, I made two small changes: I replaced the attention function with scaled_dot_product_attention to improve training speed, and added dropout after out_B (per the research findings of 'Reflash Dropout'). The comparisons below are from a model trained on downscaling algorithms only (specifically nearest, bilinear, bicubic, lanczos, and mitchell), using VGG perceptual loss, LDL, and, at the end of training, FocalFrequency loss. The weights have been released for public use (CC0 license).
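For readers unfamiliar with the swap mentioned above: `torch.nn.functional.scaled_dot_product_attention` fuses the attention computation into one optimized kernel. A minimal NumPy sketch of what that function computes mathematically (shapes and names here are illustrative, not DCTLSA's actual module):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Reference computation: softmax(q @ k^T / sqrt(d)) @ v."""
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # (B, L, L)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ v                              # (B, L, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((1, 16, 8))  # (batch, tokens, dim)
k = rng.standard_normal((1, 16, 8))
v = rng.standard_normal((1, 16, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (1, 16, 8)
```

In PyTorch, calling the built-in instead of a hand-written version lets the backend pick a fused (e.g. memory-efficient or FlashAttention-style) kernel, which is where the training-speed gain comes from.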

dctlsa_cmp_4

dctlsa_cmp_2

dctlsa_cmp_3

I also tested DCTLSA on complex realistic degradations (noise, compression, blur), and it performed very well:

dctlsa_cmp_anime_2

dctlsa_cmp_anime_0

dctlsa_cmp_anime_1

DCTLSA is a very training-efficient network. Some areas for improvement I noticed:

Thanks again to everyone that worked on this project.

Shiqi72 commented 3 months ago

Hi, I'd like to ask how the FLOPs are calculated. When reproducing the code for ×2, the FLOPs I get don't match the paper:

```python
_model = model.Model(args, checkpoint)
input = torch.randn(1, 3, 170, 170).cuda()
flops, params = profile(_model, inputs=(input, 0))
print("flops", str(flops / 1e9))
print("params", str(params / 1e6))
```

zengkun301 commented 3 months ago

> I'd like to ask how the FLOPs are calculated. When reproducing the code for ×2, the FLOPs I get don't match the paper.

Hello. When computing FLOPs for ×2, the input dimensions should be (1, 3, 640, 360) so that the output size is 1280×720.
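To see why the input size matters: convolution FLOPs grow linearly with the number of input pixels, so profiling at 170×170 instead of the paper's setting changes the total by roughly the pixel-count ratio. A back-of-the-envelope check for a single 3×3 conv layer (the channel count is illustrative, not DCTLSA's exact configuration):

```python
def conv_flops(h, w, c_in, c_out, k=3):
    # Multiply-accumulates for one k x k conv with 'same' padding
    return h * w * c_in * c_out * k * k

c = 64  # illustrative channel width
small = conv_flops(170, 170, c, c)
paper = conv_flops(360, 640, c, c)  # LR input giving a 1280x720 x2 output
print(paper / small)  # ~7.97x more FLOPs per layer
```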

Shiqi72 commented 3 months ago

Thank you very much for the answer. I have one more question from reading the paper: since SA can extract global features, what exactly does adding the 3×3 depth-wise convolution layer in the LFE stage to enlarge the receptive field contribute?

zengkun301 commented 2 months ago

> Since SA can extract global features, what exactly does adding the 3×3 depth-wise convolution layer in the LFE stage to enlarge the receptive field contribute?

Hello. The purpose of the 3×3 depth-wise convolution in the LFE stage is to enlarge the receptive field without introducing many extra parameters (compared with a standard convolution). In theory SA can extract global features, but in practice its ability to capture global information is limited.
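The parameter saving described above is easy to verify: a depth-wise 3×3 convolution learns one k×k filter per channel (C·k·k weights), versus C·C·k·k for a standard convolution mixing all channels. A quick check with an illustrative channel count (not DCTLSA's exact width):

```python
def standard_conv_params(c_in, c_out, k=3):
    return c_in * c_out * k * k  # ignoring bias

def depthwise_conv_params(c, k=3):
    return c * k * k             # one k x k filter per channel

c = 64
print(standard_conv_params(c, c))  # 36864
print(depthwise_conv_params(c))    # 576, i.e. 64x fewer parameters
```

In PyTorch this corresponds to `nn.Conv2d(c, c, 3, padding=1, groups=c)`, which is why the receptive field grows at almost no parameter cost.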