liuqk3 / PUT

Papers: 'Transformer based Pluralistic Image Completion with Reduced Information Loss' (TPAMI 2024) and 'Reduce Information Loss in Transformers for Pluralistic Image Inpainting' (CVPR 2022)
MIT License

Poor inference results on own dataset #40

Open gaishi7 opened 3 months ago

gaishi7 commented 3 months ago

[three example result images attached]

Hello, after training on my own dataset, I ran inference with:

```
python scripts/inference.py --func inference_inpainting --name OUTPUT/cvpr2022_transformer_ffhq/checkpoint/last.pth --input_res 256,256 --num_token_per_iter 100 --num_token_for_sampling 300 --num_replicate 1 --image_dir data/1 --mask_dir irregular-mask/2 --save_masked_image --save_dir out_images/cvpr2022_transformer_ffhq --num_sample 1 --gpu 0
```

The inference results are very poor. How can this be resolved?

liuqk3 commented 2 months ago

Thanks for your interest in our work. You need to make sure that (1) the P-VQVAE can reconstruct your own images well; in my experience, P-VQVAE may fail to reconstruct the textures in your example images; and (2) the transformer can generate pleasant results on your training data. If the P-VQVAE fails to reconstruct the images well, it is hard for the transformer to produce pleasant inpainted images.
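As a concrete sanity check for point (1), one can compare a training image against its P-VQVAE reconstruction with PSNR. The sketch below is a minimal, self-contained check; the file paths are placeholders, and the ~25 dB threshold is only a rough heuristic, not a value from the paper.

```python
import numpy as np
from PIL import Image

def psnr(img_a_path: str, img_b_path: str) -> float:
    """Peak signal-to-noise ratio between two same-sized RGB images."""
    a = np.asarray(Image.open(img_a_path).convert("RGB"), dtype=np.float64)
    b = np.asarray(Image.open(img_b_path).convert("RGB"), dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(255.0 ** 2 / mse)

# Hypothetical paths: an original training image and its P-VQVAE
# reconstruction saved to disk. A low PSNR (e.g. well under ~25 dB,
# a rough heuristic) suggests the codebook cannot represent your
# data's textures and the tokenizer needs (re)training on your data.
print(psnr("data/1/real.png", "recon/real.png"))
```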

gaishi7 commented 1 month ago

Thank you for your response. This is the real image: [image]. This is the reconstructed image: [image]. And this is the restored image: [image]. Why does the reconstructed image look great, but the restoration result is poor?

gaishi7 commented 1 month ago

Could you please reply if you see this?

liuqk3 commented 1 month ago

Which model did you use for this restoration? In my experience, an unconverged model (UQ-Transformer) will produce results similar to your restored image. Also, the image you provided is quite different from the three datasets I used. You may fine-tune the model on your own dataset for better performance.
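For reference, a minimal fine-tuning skeleton in plain PyTorch is sketched below. This is not the repo's actual training entry point (the provided training script and configs should be preferred); the checkpoint key `"model"` and the model's loss-returning call signature are assumptions for illustration only.

```python
import torch

def fine_tune(model, dataloader, checkpoint_path, epochs=5, lr=1e-5, device="cuda"):
    """Resume from a pretrained checkpoint and adapt to a new dataset.

    Hypothetical sketch: the checkpoint layout ({"model": state_dict})
    and the loss interface are assumptions, not this repo's real API.
    """
    state = torch.load(checkpoint_path, map_location="cpu")
    # Many checkpoints wrap the weights, e.g. {"model": ...}; fall back
    # to treating the loaded file itself as a raw state dict.
    model.load_state_dict(state["model"] if "model" in state else state)
    model.to(device).train()
    # A small learning rate helps preserve the pretrained weights.
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for epoch in range(epochs):
        for images, masks in dataloader:
            # Assumed interface: the model returns a scalar training loss.
            loss = model(images.to(device), masks.to(device))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch}: loss {loss.item():.4f}")
```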