XiangZ-0 / HiT-SR

[ECCV 2024 - Oral] HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution

Great work! Any chance to support fp16 inference? #1

Closed: FlotingDream closed this issue 4 months ago

FlotingDream commented 4 months ago

Great work! Any chance of supporting fp16 inference? Thanks!

XiangZ-0 commented 4 months ago

Hi, thanks for your interest. Our work is mainly built on BasicSR. Unfortunately, it seems that BasicSR doesn't support fp16 at the moment (https://github.com/XPixelGroup/BasicSR/issues/477#issuecomment-934294844).

FlotingDream commented 4 months ago

> Hi, thanks for your interest. Our work is mainly built on BasicSR. Unfortunately, it seems that BasicSR doesn't support fp16 at the moment (XPixelGroup/BasicSR#477 (comment)).

I mean for HiT-SRF (the only model I tested): running inference with the pretrained model in fp16 produces a black image. Maybe something could be improved in the position-bias selection part of the SCC class? Thanks!
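For context, the failure can be reproduced along these lines; this is a hypothetical sketch, where the `HiTSRF` import, constructor arguments, and checkpoint path are placeholders rather than the repo's exact API:

```python
import torch

# Placeholder import: stands in for the actual HiT-SRF model class in the repo.
from hit_sr import HiTSRF

model = HiTSRF(upscale=4)
model.load_state_dict(torch.load("HiT-SRF-4x.pth"))  # hypothetical checkpoint
model = model.cuda().half().eval()                   # cast fp32 weights to fp16

lr = torch.rand(1, 3, 64, 64, device="cuda").half()  # low-resolution input
with torch.no_grad():
    sr = model(lr)

# Overflowed activations become inf/nan; image writers clip them to zero,
# which is why the saved output looks all black.
print(torch.isinf(sr).any().item(), torch.isnan(sr).any().item())
```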

XiangZ-0 commented 4 months ago

I tried running fp16 inference and observed value overflow issues (inf or nan, which cause the black outputs) when directly using the models pre-trained with fp32 precision. Since re-training in pure fp16 is not feasible with BasicSR, some possible solutions might be:

  1. Re-train the models with softmax layers or gradient clipping to prevent the overflow problem. In HiT-SR we drop the softmax layers for better performance, which probably makes fp16 inference with fp32-trained models difficult: the correlation outputs can exceed 65504, the fp16 maximum (see the sketch after this list).
  2. Re-train the models with torch.cuda.amp to make them more compatible with fp16 inference (see the sketch at the end of this comment).
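To illustrate point 1, here is a minimal sketch in plain PyTorch (not HiT-SR code) of why un-softmaxed correlation scores break fp16: anything above 65504 overflows to inf, while a softmax would have bounded the values to [0, 1]:

```python
import torch

# Raw (un-softmaxed) correlation scores: values above 65504 overflow in fp16.
scores = torch.tensor([50000.0, 70000.0])
print(scores.half())                  # second entry becomes inf

# inf propagates through subsequent arithmetic as inf/nan (inf - inf = nan).
print(scores.half() - scores.half())

# A softmax with the usual max-subtraction trick keeps outputs in [0, 1],
# which fp16 represents without trouble.
probs = torch.softmax(scores - scores.max(), dim=0)
print(probs.half())                   # finite values in [0, 1]
```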

I am not an expert on this topic, but I hope this helps.
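For option 2, a minimal torch.cuda.amp training sketch, assuming a generic PyTorch loop; the model, loss, and data below are dummy stand-ins for the real HiT-SR training pipeline:

```python
import torch
from torch.cuda.amp import autocast, GradScaler

# Stand-in model and data; a real run would use HiT-SR and an SR dataset.
model = torch.nn.Conv2d(3, 3, 3, padding=1).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
criterion = torch.nn.L1Loss()
scaler = GradScaler()
loader = [(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))]  # dummy batch

for lr_img, hr_img in loader:
    optimizer.zero_grad()
    with autocast():                      # forward pass in mixed precision
        sr = model(lr_img.cuda())
        loss = criterion(sr, hr_img.cuda())
    scaler.scale(loss).backward()         # scale loss to avoid fp16 underflow
    scaler.step(optimizer)                # unscales gradients, then steps
    scaler.update()
```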

FlotingDream commented 4 months ago

Thanks for the quick reply, it was very helpful. Appreciated!