The results were relatively bad? Why?

liuqk3 / PUT

Papers: 'Transformer based Pluralistic Image Completion with Reduced Information Loss' (TPAMI 2024) and 'Reduce Information Loss in Transformers for Pluralistic Image Inpainting' (CVPR 2022)

MIT License

173 stars, 15 forks

The results were relatively bad? Why? #35

Open lzwRover321 opened 3 months ago

lzwRover321 commented 3 months ago

Thanks for the excellent code. When I run inference on my picture, I get the results below.

gt: [image]
mask: [image]
Lama result (from https://github.com/advimman/lama): [image]
PUT result (model tpami2024_vit_base_naturalscene_res512): [image]

The command I used:

python scripts/inference.py --func inference_inpainting \
  --name OUTPUT/tpami2024_vit_base_naturalscene_res512/checkpoint/000599e_742800iter.pth \
  --input_res 512,512 \
  --num_token_per_iter -1 \
  --num_token_for_sampling 1 \
  --num_sample 1 \
  --num_replicate 1 \
  --image_dir data/naturalscene_512_sample/gt \
  --mask_dir data/naturalscene_512_sample/mr0.5_0.6 \
  --gpu 0

liuqk3 commented 3 months ago

For the model trained on Places, you can try either of the following two commands:

python scripts/inference.py --func inference_inpainting \
  --name OUTPUT/tpami2024_vit_base_naturalscene_res512/checkpoint/000599e_742800iter.pth \
  --input_res 512,512 \
  --num_token_per_iter 20 \
  --num_token_for_sampling 200 \
  --num_sample 1 \
  --num_replicate 1 \
  --image_dir data/naturalscene_512_sample/gt \
  --mask_dir data/naturalscene_512_sample/mr0.5_0.6 \
  --gpu 0

# or

python scripts/inference.py --func inference_inpainting \
  --name OUTPUT/tpami2024_vit_base_naturalscene_res512/checkpoint/000599e_742800iter.pth \
  --input_res 512,512 \
  --num_token_per_iter -1 \
  --num_token_for_sampling 20 \
  --num_sample 1 \
  --num_replicate 1 \
  --image_dir data/naturalscene_512_sample/gt \
  --mask_dir data/naturalscene_512_sample/mr0.5_0.6 \
  --gpu 0

lzwRover321 commented 3 months ago

[image: naturalscene_512_sample]

Thank you for your reply, but I still haven't got a good result.

liuqk3 commented 3 months ago

The mask should be in PNG format, and every pixel value in the mask should be exactly 0 or 255. A JPEG mask introduces artifacts: lossy compression leaves intermediate gray values around the mask boundary.
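
For anyone starting from a JPEG mask, here is a minimal sketch of one way to convert it to a strictly binary, single-channel PNG. It is not part of the PUT repository: the helper names `binarize_mask` and `convert_mask_file` and the threshold of 128 are assumptions made for illustration, and it relies only on NumPy and Pillow.

```python
import numpy as np


def binarize_mask(mask: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Return a single-channel uint8 mask that contains only 0 and 255.

    `threshold` is an assumed cut-off; values >= threshold become 255.
    """
    mask = np.asarray(mask)
    if mask.ndim == 3:
        # Collapse an RGB(A) mask to one channel by averaging the first three channels.
        mask = mask[..., :3].mean(axis=-1)
    return np.where(mask >= threshold, 255, 0).astype(np.uint8)


def convert_mask_file(src_path: str, dst_path: str, threshold: int = 128) -> None:
    """Load a mask in any format Pillow can read and save it as a binary PNG."""
    # Pillow is imported here so that binarize_mask itself needs only NumPy.
    from PIL import Image

    gray = np.array(Image.open(src_path).convert("L"))
    Image.fromarray(binarize_mask(gray, threshold)).save(dst_path, format="PNG")
```

Calling `convert_mask_file("mask.jpg", "mask.png")` (hypothetical file names) writes a lossless PNG whose pixels are all 0 or 255. Check which of the two values marks the region to inpaint in your own data before running inference; the sketch does not invert the mask.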

lzwRover321 commented 3 months ago

> The mask should be in PNG format. The pixel values in mask should be 0 or 255. The JPEG mask image will introduce artifacts.

Okay, so the PNG mask is a single channel. I got better results, but the edges don't seem to transition smoothly. Great work, thanks again!