ZPdesu / SEAN

SEAN: Image Synthesis with Semantic Region-Adaptive Normalization (CVPR 2020, Oral)
https://zpdesu.github.io/SEAN/

question about results #8

Closed Ha0Tang closed 4 years ago

Ha0Tang commented 4 years ago

Hi, why are your results in Table 2 (Cityscapes and ADE20K) different from those in the SPADE paper, given that you used the same dataset train/test splits? For instance, the SPADE paper reports 62.3 mIoU, 81.9 accuracy, and 71.8 FID on Cityscapes, but you reported 57.88 mIoU, 93.59 accuracy, and 50.38 FID, respectively.

ZPdesu commented 4 years ago

There are several reasons.

The results reported in the SPADE paper use only segmentation masks as input.

The SPADE results reported in the SEAN paper are measured on reconstructed images: instance maps and style images (the GT images) are used as input, and the networks are trained for 50 epochs on Cityscapes and ADE20K.

Also, the authors of SPADE did not release their evaluation code; they only mentioned in their repo issues where to download the models and pre-trained weights. We re-tested those results and they also differ from what is reported in their paper. Therefore, we report the numbers from our own fair comparison on the reconstruction task.
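
For reference, here is a minimal sketch of that reconstruction protocol, assuming a hypothetical `generator`/`style_encoder` interface and a standard test loader (this is not the repo's actual test script):

```python
import torch

@torch.no_grad()
def reconstruct_test_set(generator, style_encoder, loader):
    """Reconstruct every test image from its own label map and style code."""
    reconstructions, ground_truths = [], []
    for batch in loader:
        label = batch["label"]        # semantic segmentation mask
        instance = batch["instance"]  # instance map (used on Cityscapes)
        gt = batch["image"]           # ground-truth image, also the style source
        style = style_encoder(gt, label)         # style code(s) extracted from the GT image
        fake = generator(label, instance, style)
        reconstructions.append(fake.cpu())
        ground_truths.append(gt.cpu())
    return torch.cat(reconstructions), torch.cat(ground_truths)

# FID is then computed between reconstructions and ground truths; mIoU and
# accuracy come from running a pre-trained segmentation network on the fakes.
```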

Ha0Tang commented 4 years ago

So, to generate your results in Fig. 6 of your paper, you need to input both the 'Label' and the 'Ground Truth', am I right?

ZPdesu commented 4 years ago

You are right.

Ha0Tang commented 4 years ago

Are the results of both pix2pixHD and SPADE in Fig. 6 also generated using both 'Label' and 'Ground Truth'?

ZPdesu commented 4 years ago

Yes. Both of these frameworks have their own style encoders.
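
For context, pix2pixHD conditions its decoder on instance-wise pooled encoder features, while SPADE uses a global image encoder. Below is a rough sketch of the instance-wise pooling idea for a single image (illustrative names, not either repo's actual API):

```python
import torch

def instance_average_pool(features: torch.Tensor, instance_map: torch.Tensor) -> torch.Tensor:
    """Sketch of pix2pixHD-style instance-wise feature pooling.

    `features` is (C, H, W) encoder output; `instance_map` is (H, W) integer ids.
    Every pixel of an instance receives that instance's mean feature vector.
    """
    out = torch.empty_like(features)
    for inst_id in instance_map.unique():
        mask = instance_map == inst_id                      # (H, W) boolean region
        mean = features[:, mask].mean(dim=1, keepdim=True)  # (C, 1) per-instance mean
        out[:, mask] = mean                                 # broadcast back over the region
    return out
```

Feeding the GT image through such an encoder is what lets these baselines perform reconstruction in the Fig. 6 comparison.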

Ha0Tang commented 4 years ago

I see. Thanks.