Hi, why are your results in Table 2 (Cityscapes and ADE20K) different from those in the SPADE paper, even though you used the same dataset train/test splits? For instance, the SPADE paper reports 62.3 mIoU, 81.9 accu, and 71.8 FID on Cityscapes, but you report 57.88 mIoU, 93.59 accu, and 50.38 FID, respectively.
There are several reasons.
The results reported in the SPADE paper only use segmentation masks as input.
The SPADE results reported in the SEAN paper are measured on reconstructed images: instance maps and style images (the ground-truth images) are also used as input. The networks are trained for 50 epochs on Cityscapes and ADE20K.
Also, the SPADE authors did not provide evaluation code; they only mentioned in their repo issues where to download the models and pre-trained weights. We retested those results, and they also differ from what is reported in the paper. Therefore, we report the results of our own fair comparison on this reconstruction task.
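To make the distinction concrete, here is a minimal, hypothetical sketch (not the actual SEAN or SPADE code) of the two input protocols: in the label-only setting, only the segmentation mask is fed to the generator; in the reconstruction setting, a style encoder additionally extracts a code from the ground-truth image. All class names, channel counts, and shapes below are placeholder assumptions.

```python
import torch
import torch.nn as nn

class StyleEncoder(nn.Module):
    """Toy encoder: maps a ground-truth (style) image to a per-image style code."""
    def __init__(self, style_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, style_dim)

    def forward(self, image):
        return self.fc(self.net(image).flatten(1))

class Generator(nn.Module):
    """Toy generator: maps a one-hot label map to RGB, optionally modulated by a style code."""
    def __init__(self, num_classes=35, style_dim=256):  # e.g. ~35 label channels for Cityscapes
        super().__init__()
        self.from_label = nn.Conv2d(num_classes, 64, 3, padding=1)
        self.from_style = nn.Linear(style_dim, 64)
        self.to_rgb = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, label_onehot, style_code=None):
        h = self.from_label(label_onehot)
        if style_code is not None:  # reconstruction setting: inject the style of the GT image
            h = h + self.from_style(style_code)[:, :, None, None]
        return torch.tanh(self.to_rgb(h))

gen = Generator()
label = torch.zeros(1, 35, 256, 512)          # one-hot segmentation mask

# Label-only protocol (as in the SPADE paper): only the mask goes in.
fake_label_only = gen(label)

# Reconstruction protocol (as in the comparison above): the GT image drives a style encoder.
gt_image = torch.rand(1, 3, 256, 512)
style = StyleEncoder()(gt_image)
fake_reconstruction = gen(label, style)
```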
So, you need to input both the 'Label' and the 'Ground Truth' to generate your results in Fig. 6 of your paper, am I right?
You are right.
Are the results of both Pix2pixHD and SPADE in Fig. 6 also generated using both the 'Label' and the 'Ground Truth'?
Yes. Both of these frameworks have their own style encoders.
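For reference, one way such an encoder can use the ground-truth image is the instance-wise feature pooling idea from pix2pixHD, sketched below under assumed tensor shapes (an illustrative snippet, not the code used in either paper).

```python
import torch

def instance_wise_average(features, instance_map):
    """Replace each instance's features with their spatial mean over that instance's mask,
    the pooling idea used by pix2pixHD-style feature encoders.
    features: (B, C, H, W) float; instance_map: (B, 1, H, W) integer instance ids."""
    pooled = features.clone()
    for b in range(features.size(0)):
        for inst_id in instance_map[b].unique():
            mask = (instance_map[b, 0] == inst_id)      # (H, W) bool mask of this instance
            region = features[b][:, mask]               # (C, N) features of the masked pixels
            pooled[b][:, mask] = region.mean(dim=1, keepdim=True)
    return pooled

# Usage sketch: pool encoder features of the GT image per instance, then feed them
# to the generator alongside the label map.
feat = torch.rand(1, 8, 64, 128)
inst = torch.randint(0, 5, (1, 1, 64, 128))
style_features = instance_wise_average(feat, inst)
```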
I see. Thanks.