Royalvice / DocDiff

ACM Multimedia 2023: DocDiff: Document Enhancement via Residual Diffusion Models. Also contains 1597 red seals in Chinese scenes, along with their corresponding binary masks.
https://www.aibupt.com/
MIT License
196 stars, 21 forks

Inference on variable image size #11

Closed by zairm21 10 months ago

zairm21 commented 10 months ago

Hi, first off, thanks for sharing your work. Does DocDiff work on document images of different sizes, e.g. an image of an entire document page, or does it only work on small square patches?

Thanks!

Royalvice commented 10 months ago

Thank you for your interest in DocDiff. See Issue #7 for more information. DocDiff can run inference on images of arbitrary size, as long as both the width and the height are divisible by 8, i.e. 2**(number of downsampling iterations).

zairm21 commented 10 months ago

I padded the image and it worked. Thanks @Royalvice
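
For anyone landing on this thread later, here is a minimal sketch of the padding workaround. It is not code from the DocDiff repository, and the thread does not say how the padding was actually done. The helper names `pad_to_multiple` and `crop_to_original` are hypothetical, the image is assumed to be a NumPy array laid out as (height, width) or (height, width, channels), and edge replication is just one plausible padding mode.

```python
import numpy as np


def pad_to_multiple(image: np.ndarray, multiple: int = 8, mode: str = "edge"):
    """Pad the bottom and right of an image so that H and W are multiples of `multiple`.

    Hypothetical helper, not part of DocDiff. Returns the padded image and the
    original (height, width) so the model output can be cropped back later.
    """
    h, w = image.shape[:2]
    pad_h = (-h) % multiple  # rows to add at the bottom (0 if already divisible)
    pad_w = (-w) % multiple  # columns to add on the right
    # Pad only the two spatial axes; leave any channel axis untouched.
    pad_width = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode=mode)
    return padded, (h, w)


def crop_to_original(image: np.ndarray, original_hw):
    """Remove the padding added by pad_to_multiple."""
    h, w = original_hw
    return image[:h, :w]


# Example: a 1001 x 750 RGB page is padded to 1008 x 752 before inference.
page = np.zeros((1001, 750, 3), dtype=np.uint8)
padded, original_hw = pad_to_multiple(page)
print(padded.shape)  # (1008, 752, 3)
restored = crop_to_original(padded, original_hw)
print(restored.shape)  # (1001, 750, 3)
```

Padding only the bottom and right edges keeps the original pixels at the same coordinates, so cropping the enhanced output back to the original height and width is a plain slice.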