XPixelGroup / HAT

CVPR2023 - Activating More Pixels in Image Super-Resolution Transformer
Arxiv - HAT: Hybrid Attention Transformer for Image Restoration
Apache License 2.0
1.14k stars, 134 forks

Does the Real-HAT have a small version like HAT-S? #126

Open ANYMS-A opened 5 months ago

ANYMS-A commented 5 months ago

Does Real-HAT have a small version like HAT-S? I want to deploy HAT on an edge-computing device, but Real-HAT is too large to be deployed there.

0MiDo0 commented 5 months ago

I ran into the same problem: Real-HAT SRx4 requires too much VRAM. It would be better if the author released a Real-HAT SRx2 version.

wangxinchao-bit commented 2 months ago

Have you trained Real-HAT SRx2?

0MiDo0 commented 2 months ago

Not yet. My GPU (RTX 4070Ti, 12GB) is not enough for training. If you want a good model, you need a large batch_size for training, and 12GB is only enough for batch_size=1 (img_size 512).

wangxinchao-bit commented 2 months ago

Have you come across any smaller models that deliver slightly better performance? I'm interested in experimenting with a GAN. I saw that the author trained their network using a 2080 with a batch size of 4 for x4 super-resolution and achieved impressive results. However, I'm not sure about the duration of their training process.

0MiDo0 commented 2 months ago

My training images are large (512*512), so batch_size 1 already takes 11.7 GB of VRAM. I'm now using Real-ESRGAN. It's OK for me, but not as good as HAT. There is a new network called DRCT that you can have a look at: https://github.com/ming053l/drct
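
As a rough back-of-the-envelope sketch of the figures quoted above (11.7 GB at batch_size 1 with 512*512 crops on a 12 GB card), the snippet below estimates how many smaller crops might fit in the same budget. It assumes, purely for illustration, that training memory scales linearly with the number of pixels per batch and ignores the fixed cost of weights and optimizer state, so real numbers will differ. The function names are hypothetical and not part of HAT or BasicSR.

```python
# Reference point taken from the thread: 512x512 crops, batch_size 1, 11.7 GB.
REF_PATCH = 512
REF_BATCH = 1
REF_VRAM_GB = 11.7


def estimate_vram_gb(patch_size, batch_size):
    """Scale the reference VRAM figure linearly with pixels per batch.

    Assumption: memory is proportional to batch_size * patch_size**2.
    This is a simplification, not a measured property of HAT.
    """
    ref_pixels = REF_BATCH * REF_PATCH ** 2
    pixels = batch_size * patch_size ** 2
    return REF_VRAM_GB * pixels / ref_pixels


def max_batch_size(patch_size, vram_budget_gb):
    """Largest batch size whose estimated footprint fits in the budget."""
    per_sample_gb = estimate_vram_gb(patch_size, 1)
    return int(vram_budget_gb // per_sample_gb)


if __name__ == "__main__":
    for patch in (512, 256, 128):
        print(
            f"patch {patch}: ~{estimate_vram_gb(patch, 1):.2f} GB per sample, "
            f"estimated max batch size on 12 GB: {max_batch_size(patch, 12.0)}"
        )
```

Under this assumption, cropping to 256*256 would leave room for a batch size of about 4 on the same card, which is one way to read the batch size of 4 mentioned earlier in the thread. Treat the result as a starting point for experiments rather than a guarantee.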

wangxinchao-bit commented 2 months ago

Thank you. I wonder how long it takes you to run super-resolution on 512*512 images with Real-ESRGAN?
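
For anyone who wants to answer this kind of question on their own hardware, here is a minimal timing sketch. It is model-agnostic: `run_once` is a hypothetical callable that performs one upscaling pass, and `fake_upscale` is only a stand-in so the snippet runs without any model installed. Neither name comes from Real-ESRGAN or HAT.

```python
import time


def time_inference(run_once, warmup=1, repeats=5):
    """Return the average wall-clock seconds of run_once over `repeats` calls.

    Warm-up calls are executed first and not timed, since the first pass
    often includes one-off costs such as model loading or kernel setup.
    """
    for _ in range(warmup):
        run_once()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


def fake_upscale():
    """Hypothetical stand-in for a single 512x512 super-resolution pass."""
    time.sleep(0.01)


if __name__ == "__main__":
    avg = time_inference(fake_upscale, warmup=1, repeats=5)
    print(f"average time per image: {avg * 1000:.1f} ms")
```

When timing a PyTorch model on a GPU, the callable should call `torch.cuda.synchronize()` before returning, because CUDA kernels run asynchronously and the timer would otherwise stop too early.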