liuqk3 / PUT

Code for the papers 'Transformer based Pluralistic Image Completion with Reduced Information Loss' (TPAMI 2024) and 'Reduce Information Loss in Transformers for Pluralistic Image Inpainting' (CVPR 2022)
MIT License

A question about the inference #31

Open youyou0805 opened 8 months ago

youyou0805 commented 8 months ago

Hello, thanks for the excellent code. I am facing a problem with inference. When I run the command below for image inpainting with the provided transformer model:

`python scripts/inference_inpainting.py --func inference_inpainting --name transformer_ffhq --image_dir data/image.png --mask_dir data/mask.png --save_dir sample --input_res 512,512`

the output is two blank txt files, as shown in the attached screenshot. Could you help me identify where the problem might be? Your help is greatly appreciated!

liuqk3 commented 8 months ago

@youyou0805 Thanks for your interest in our project. I do not see any errors in your screenshot, but two things look off.

(1) You only provided one image/mask pair, so it is better to specify a single GPU (such as `--gpu 0`). It seems the script found two GPUs on your machine.

(2) Currently, the publicly available code only supports a resolution of 256x256; 512x512 is not supported.

You can try again after fixing these two things.
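Putting the two suggestions together, the adjusted invocation might look like the following sketch. The `--gpu 0` flag and the 256x256 resolution come from this thread; whether `--input_res` accepts these exact values is an assumption, and the input image and mask may need to be resized to 256x256 beforehand.

```shell
# Hypothetical adjusted command based on the two fixes above:
# pin a single GPU and use the supported 256x256 resolution.
python scripts/inference_inpainting.py \
    --func inference_inpainting \
    --name transformer_ffhq \
    --image_dir data/image.png \
    --mask_dir data/mask.png \
    --save_dir sample \
    --gpu 0 \
    --input_res 256,256
```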

youyou0805 commented 8 months ago

Thanks for your reply, my question has been resolved!

boyu-chen-intern commented 8 months ago

Hello, how can I keep the size of the original image? Currently the image produced by calling the model via the Simpler Inference method is only 256x256, which is not very sharp. Thank you.

liuqk3 commented 8 months ago

Hi @boyu-chen-intern , the P-VQVAE is compatible with different image sizes, but the UQ-Transformer is dedicated to sequences of length 1024 = 32x32. Hence the model cannot inpaint images at any size other than 256x256.
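A minimal sketch of why the constraint holds, assuming the P-VQVAE downsamples by a factor of 8 (which is what a 256x256 input mapping to a 32x32 token grid implies; the factor is an inference from this thread, not taken from the code):

```python
def token_sequence_length(height: int, width: int, downsample: int = 8) -> int:
    """Length of the token sequence the transformer receives for a given
    input size, under the assumed downsampling factor of 8."""
    if height % downsample or width % downsample:
        raise ValueError("input size must be divisible by the downsampling factor")
    return (height // downsample) * (width // downsample)

# The UQ-Transformer expects exactly 1024 tokens, i.e. a 32x32 grid:
print(token_sequence_length(256, 256))  # -> 1024

# A 512x512 input would yield a 64x64 grid (4096 tokens), which the
# transformer cannot consume without retraining:
print(token_sequence_length(512, 512))  # -> 4096
```

This is why resizing inputs to 256x256 (and optionally upscaling the result afterwards) is the only way to handle other image sizes with the released model.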

boyu-chen-intern commented 8 months ago

Thank you for your reply and the wonderful work!