advimman / lama

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
https://advimman.github.io/lama-project/
Apache License 2.0

the input_size and out_size of big-lama #83

Open · ray0809 opened 2 years ago

ray0809 commented 2 years ago

Hi! First, thank you for making such a great project open source. I found that out_size in the released big-lama config.yaml is 256. Was the big-lama model trained with images of size 256?

windj007 commented 2 years ago

Hi! All LaMa models in the paper were trained on 256x256 crops from Places. The original resolution of images in Places is approximately 512.
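
For concreteness, here is a minimal sketch of that cropping setup. The use of torchvision transforms is an assumption for illustration; the repository's actual data pipeline may be built with a different library. The key point is that crops are taken at the image's native (~512) resolution, with no resizing step.

```python
from torchvision import transforms

CROP_SIZE = 256  # matches out_size in the released big-lama config.yaml

# Crop directly at the image's native (~512) resolution; because there is
# no Resize step, object/texture scale in the crop is unchanged.
train_transform = transforms.Compose([
    transforms.RandomCrop(CROP_SIZE),
    transforms.ToTensor(),
])
```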

windj007 commented 2 years ago

Feel free to reopen the issue if you have further questions.

hjq133 commented 2 years ago

Hi, @windj007. When training on Places, why doesn't LaMa scale the image down to 256 before cropping? Is cropping from the full-resolution image more meaningful than directly taking 256x256 crops from a resized one?

windj007 commented 2 years ago

Due to the nature of convolutions, the networks adapt to the scale of objects/textures and perform best at exactly that scale. Inpainting at 256 is not very interesting in practice, so why optimize methods for such a low resolution? The original resolution of images in Places is 512, so we decided to keep that scale (i.e., the average size of objects in pixels), but training directly at 512 is very expensive, so we used crops.
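
To make the trade-off concrete, here is a hedged sketch of the two alternatives, again written with torchvision transforms as an assumed stand-in for whatever the repository actually uses:

```python
from torchvision import transforms

# Option A (NOT what the paper describes): downscale the whole ~512 image
# to 256 first. Objects shrink to roughly half their original pixel size,
# so the network learns a different object/texture scale.
resize_then_crop = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(256),
])

# Option B (what the paper describes): crop 256x256 directly from the ~512
# image. This preserves the original object scale while keeping the
# per-batch training cost of a 256x256 input.
crop_at_native_scale = transforms.RandomCrop(256)
```

Option B keeps the pixel statistics of full-resolution Places while avoiding the cost of training directly at 512, which is the motivation given above.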