amjltc295 / Free-Form-Video-Inpainting

Official PyTorch implementation of "Learnable Gated Temporal Shift Module for Deep Video Inpainting" (Chang et al., BMVC 2019) and the FVI dataset from "Free-form Video Inpainting with 3D Gated Convolution and Temporal PatchGAN" (Chang et al., ICCV 2019).
https://arxiv.org/abs/1907.01131

question on evaluate.py #23

Closed: kinfeparty closed this issue 4 years ago

kinfeparty commented 4 years ago

Hello, I have run into a bug with evaluate.py. I read script_usage.md, have already run the code and obtained the results, and now I want to use evaluate.py to evaluate other results. So I used the original results as an experiment, but hit some problems.

I ran

python evaluate.py -rgd ../dataset/test_20181109/JPEGImages -rmd ../dataset/random_masks_vl20_ns5_object_like_test/ -rrd saved/VideoInpaintingModel_v0.3.0_l1_m+a_mvgg_style_1_6_0.05_120_0_0_all_mask/0102_214744/test_outputs -mask

and found that the "-mask" argument doesn't exist. Is it possible that the code was modified? I also hit another bug: I can't read the result images.

[screenshot of the error]

I'd appreciate it if you could answer my questions.
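For reference, if the documented flags have drifted from the current evaluate.py, frame-wise PSNR/SSIM can also be computed with a small stand-alone script. Below is a minimal sketch, assuming two directories of identically named, aligned RGB frames; the paths are placeholders and this is not the repo's own evaluate.py:

```python
# Stand-alone metric sketch (assumed directory layout, not the repo's evaluate.py).
import os
import numpy as np
from PIL import Image
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def load_frame(path):
    # Load an image as an HxWx3 uint8 RGB array.
    return np.array(Image.open(path).convert("RGB"))

def evaluate_dirs(result_dir, gt_dir):
    # Assumes both directories contain identically named, aligned frames.
    psnrs, ssims = [], []
    for name in sorted(os.listdir(gt_dir)):
        gt = load_frame(os.path.join(gt_dir, name))
        out = load_frame(os.path.join(result_dir, name))
        psnrs.append(peak_signal_noise_ratio(gt, out, data_range=255))
        # channel_axis requires scikit-image >= 0.19 (older versions use multichannel=True).
        ssims.append(structural_similarity(gt, out, channel_axis=-1, data_range=255))
    return float(np.mean(psnrs)), float(np.mean(ssims))

if __name__ == "__main__":
    # Placeholder paths; point these at one video's result frames and ground-truth frames.
    psnr, ssim = evaluate_dirs("test_outputs/video_0001", "JPEGImages/video_0001")
    print(f"PSNR: {psnr:.2f} dB  SSIM: {ssim:.4f}")
```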

amjltc295 commented 4 years ago

That doc could be outdated. @Nash2325138 could you take a look?

kinfeparty commented 4 years ago

@amjltc295 Hello, I have a new question about evaluation. I did some evaluation on video inpainting. My input frame size is 256×256, and my input mask is a 100×100 square in the center of the frame, about 15% of the frame area. I found that the PSNR is really high, over 36, but in the image inpainting papers I have read, the PSNR on their datasets is mostly between 25 and 30, such as in this table: [table screenshot]. I read the evaluation code. I don't mean that your code is wrong; I just want to find out why the value is so high. I'm writing this issue because I think maybe you can help me.

amjltc295 commented 4 years ago


I'm not sure what you did for the evaluation; there may be some bug if the PSNR is a lot higher than other methods'. Also, we did not train our models with bounding-box masks, so if you measure results with bounding-box masks they may be worse (in our paper we don't see much improvement in MSE either). Other than a wrong code/evaluation process, there is a trade-off between perceptual quality and distortion measures (see http://openaccess.thecvf.com/content_cvpr_2018/papers/Blau_The_Perception-Distortion_Tradeoff_CVPR_2018_paper.pdf). In the inpainting task we focus more on perceptual quality (with the help of the GAN), but may lose some distortion score (the recovered videos differ from the original ones).
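One thing worth checking is whether PSNR is averaged over the whole frame or only over the masked hole: with a 100×100 hole in a 256×256 frame, roughly 85% of the pixels are untouched, so full-frame PSNR is naturally much higher than hole-only PSNR. Here is a minimal sketch of the difference, using synthetic frames (an illustration only, not the repo's metric code):

```python
# Illustration: full-frame PSNR vs. PSNR restricted to the masked hole.
import numpy as np

def psnr_from_mse(mse, data_range=255.0):
    # Standard PSNR definition: 10 * log10(MAX^2 / MSE).
    return 10.0 * np.log10((data_range ** 2) / mse)

def full_frame_psnr(gt, pred):
    mse = np.mean((gt.astype(np.float64) - pred.astype(np.float64)) ** 2)
    return psnr_from_mse(mse)

def hole_only_psnr(gt, pred, mask):
    diff2 = (gt.astype(np.float64) - pred.astype(np.float64)) ** 2
    return psnr_from_mse(diff2[mask].mean())

# Toy example: 256x256 frame with a 100x100 center hole (~15% of the pixels).
rng = np.random.default_rng(0)
gt = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
pred = gt.copy()
mask = np.zeros((256, 256, 3), dtype=bool)
mask[78:178, 78:178, :] = True

# Corrupt only the hole with noise to mimic an imperfect reconstruction.
noise = rng.integers(-20, 21, size=int(mask.sum()))
pred[mask] = np.clip(gt[mask].astype(np.int32) + noise, 0, 255).astype(np.uint8)

print("full-frame PSNR:", full_frame_psnr(gt, pred))   # inflated by the untouched ~85% of pixels
print("hole-only PSNR:", hole_only_psnr(gt, pred, mask))  # reflects only the inpainted region
```

In this toy case the full-frame value comes out roughly 8 dB higher than the hole-only value, simply because about 85% of the pixels are identical to the ground truth, so which convention an evaluation uses changes the reported PSNR a lot.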