jingyuanli001 / RFR-Inpainting

The source code for CVPR 2020 accepted paper "Recurrent Feature Reasoning for Image Inpainting"
MIT License

Unexpected results #33

Closed xiankgx closed 2 years ago

xiankgx commented 3 years ago

Dear @jingyuanli001 , I tried to train a model on the Places2 dataset. After training for approximately 750,000 iterations (batch size 9 on 3 GPUs, image size 384), I extracted the checkpoint and tested it on the COCO dataset, and I'm getting unsatisfactory results. Why do you think there are obvious "lime water" effects around the masked regions?

The testing masks and the training masks are the same.

[Attached images: comp, fake, and masked results for samples 0–4]

jingyuanli001 commented 3 years ago

Hi, have you fine-tuned the model following the procedure in the README file? The distorted color very likely comes from inadequate fine-tuning. Fine-tuning is very important here because during fine-tuning we stop updating the batch-normalization parameters, which are not stable due to the unpredictable mask regions.
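One common PyTorch recipe for this kind of fine-tuning is to switch every batch-norm layer to eval mode (so the running statistics stop updating) while the rest of the network keeps training. A minimal sketch of that idea, assuming a standard `nn.Module` model (this is an illustration, not the exact code from this repo's finetune flag, and whether the affine gamma/beta stay trainable is a design choice):

```python
import torch.nn as nn

def freeze_bn(model: nn.Module) -> None:
    """Freeze all BatchNorm layers for fine-tuning: use the running
    statistics recorded during training instead of batch statistics,
    and stop updating those statistics and the affine parameters."""
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.eval()                      # use (and stop updating) running stats
            for p in m.parameters():
                p.requires_grad = False   # also freeze gamma/beta (optional)
```

Calling `freeze_bn(model)` after `model.train()` leaves convolutions and other layers in training mode while the BN layers behave exactly as they will at test time.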

xiankgx commented 3 years ago

I have not started the finetuning stage yet. Ok, I will follow your instructions to finetune soon. Thank you so much for your swift response and help.

xiankgx commented 3 years ago

Dear @jingyuanli001 ,

This is what I got when saving the batch every 200 iterations during training. The training results look good, but the testing results are totally different, so you could be right about the batch-norm parameters being unstable. I will start fine-tuning soon and update you.

[Attached images: input and comp]

jingyuanli001 commented 3 years ago

> Dear @jingyuanli001 ,
>
> This is what I got when saving the batch every 200 iterations during training. The training results look good, but the testing results are totally different, so you could be right about the batch-norm parameters being unstable. I will start fine-tuning soon and update you.
>
> [Attached images: input and comp]

Yes, this is what we also observed when we were developing our method. During training, the statistics used to normalize the feature map are computed dynamically from the instances in the batch, so they can be considered (almost) accurate. During testing, however, the statistics used for normalization are the running statistics recorded during training; they are not computed from the test images, so they are inaccurate, which leads to the bad results. The fine-tuning procedure is therefore asking the model to "get familiar with" the fixed statistics that will be used at test time. In fact, I personally consider the batch-normalization scheme here to be suboptimal, and there should be better normalization schemes.
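The train/eval mismatch described above is easy to reproduce with a bare `BatchNorm2d` layer: feed it a batch whose statistics differ from the recorded running statistics, and the eval-mode output is no longer zero-mean. A small sketch (synthetic data, default PyTorch momentum):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)

# A batch whose per-channel statistics (mean ~5, std ~3) differ
# strongly from the layer's initial running stats (mean 0, var 1).
x = torch.randn(8, 4, 16, 16) * 3 + 5

bn.train()
y_train = bn(x)   # normalized with the batch's own mean/var -> near zero mean

bn.eval()
y_eval = bn(x)    # normalized with the running stats -> clearly off-center
```

After one training pass the running statistics have only moved a fraction of the way toward the batch statistics, so `y_eval` is far from normalized; this is the same inaccuracy that produces the distorted colors at test time.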

xiankgx commented 3 years ago

Dear @jingyuanli001 , you were absolutely right! Fixing the batch-norm parameters helped solve the problem even with just 10k fine-tuning iterations. The following are testing samples at 780,000 iterations. Thank you so much for your help, and good job on the paper. I really like the idea of working inward from the border with partial convolution and recurrence.

[Attached images: comp, fake, and masked results for samples 0–2]

07hyx06 commented 3 years ago

Thanks for your code. What is the effect of the number of iterations? I set iter=15 and obtained the following strange results. With iter=6 (the default in RFRNet.py) the results are good.
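One way to think about this: the recurrent block shares its weights across passes, so raising the iteration count at test time makes the effective network deeper than anything the weights were optimized for, and the features can drift out of distribution. A toy illustration of a weight-shared recurrent step (hypothetical names, not the actual RFRNet module):

```python
import torch
import torch.nn as nn

class RecurrentRefiner(nn.Module):
    """Toy stand-in for a weight-shared recurrent block: the same
    convolution is applied `iters` times, so changing `iters` changes
    the effective depth without changing any weights."""
    def __init__(self, channels: int = 8, iters: int = 6):
        super().__init__()
        self.step = nn.Conv2d(channels, channels, 3, padding=1)
        self.iters = iters

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        for _ in range(self.iters):   # same weights reused on every pass
            feat = feat + torch.relu(self.step(feat))
        return feat
```

Loading identical weights into a 6-iteration and a 15-iteration instance and comparing outputs makes the depth mismatch concrete: the extra passes keep transforming features the training run never produced.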

[Attached images: masked_img_1 and img_1]

htzheng commented 2 years ago

Hi @xiankgx, nice job! Since the official Places2 model has not been released yet, I wonder if by any chance you would be able to share the Places2 weights so that we could make a fair comparison with RFR. Thanks a lot! :)