Algolzw / BSRT

PyTorch code for "BSRT: Improving Burst Super-Resolution with Swin Transformer and Flow-Guided Deformable Alignment", CVPRW, 1st place in NTIRE 2022 BurstSR Challenge (real-world track).
MIT License

Improving Model for Text Super-Resolution #17

Open raphischwarz opened 3 months ago

raphischwarz commented 3 months ago

Dear author, first and foremost, thank you very much for your great work and your effort towards improving burst super-resolution. I am currently training and testing your model myself on printed-text super-resolution, supermarket price tags to be more specific. The images basically consist of black text on a white or yellow background; I can add some sample images if needed.

My main improvements have come from optimizing the input data quality and collecting real-world training data closely related to my problem. However, I have been asking myself whether there might be changes to the model itself that would further boost performance on real-world printed text. My biggest problem is that the model achieves really good results on setups represented in the training data, but does not generalize well to different lighting conditions or text/background color combinations.
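One common way to attack exactly this kind of lighting/color generalization gap (not something from the BSRT paper itself, just a standard technique) is photometric augmentation of the training bursts. A minimal NumPy sketch, assuming bursts are float arrays in [0, 1] with shape (frames, H, W, C); applying the same jitter to every frame keeps the inter-frame alignment cues intact:

```python
import numpy as np

rng = np.random.default_rng(0)

def photometric_jitter(burst, brightness=0.2, contrast=0.2, channel_gain=0.1):
    """Apply one random photometric perturbation to a whole burst.

    burst: float array of shape (frames, H, W, C) in [0, 1].
    The same jitter is applied to every frame so alignment is unaffected.
    """
    b = 1.0 + rng.uniform(-brightness, brightness)   # global brightness factor
    c = 1.0 + rng.uniform(-contrast, contrast)       # global contrast factor
    # per-channel gain simulates color casts / tinted backgrounds
    g = 1.0 + rng.uniform(-channel_gain, channel_gain, size=burst.shape[-1])
    mean = burst.mean(axis=(1, 2), keepdims=True)
    out = (burst - mean) * c + mean                  # contrast around the frame mean
    out = out * b * g                                # brightness and color cast
    return np.clip(out, 0.0, 1.0)
```

The parameter names and ranges here are illustrative placeholders; in practice you would tune them to roughly cover the lighting and tag-color variation you see at test time.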

From what I have tested so far, larger input crops do improve the super-resolution quality. Have you investigated architectural changes, for example increasing the depth and likewise the number of heads or the window size, and achieved improved results? In my case, computational resources for training are not a limitation.

Thank you very much in advance for your help and thanks again for your excellent work!

Bhavik-Ardeshna commented 1 month ago

@raphischwarz Hi, I am also working on the same use case and trying to set up BSRT. I am running into issues with running the code and setting up the dataset. Could you help with that?

Thank you.