Thanks for sharing the code. I tested GracoNet with your provided model and obtained only acc 0.740±0.005, FID 36.19, and LPIPS 0.205 on the "evaluni" dataset, but acc 0.840±0.005 on the "eval" dataset. Are these metrics obtained on different test data? The paper says the model "generates 10 composite images by randomly sampling 10 random vectors", so wouldn't it be more reasonable to compute all metrics on the "evaluni" dataset?
@lingtianxia123 In our paper, we use the "eval" dataset to test generation plausibility (acc and FID) and the "evaluni" dataset to test generation diversity (LPIPS).
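For reference, a minimal sketch of how a diversity score could be computed over the 10 composites generated per test sample, assuming diversity is the mean pairwise distance between generated images. The `mse` stand-in and all function names here are illustrative; the actual evaluation would use the `lpips` perceptual-distance package rather than MSE:

```python
import itertools
import numpy as np

def mean_pairwise_distance(images, dist_fn):
    """Average dist_fn over all unordered pairs of generated composites.

    dist_fn is a stand-in for a perceptual metric such as LPIPS
    (assumption: diversity = mean pairwise distance over the 10 samples).
    """
    pairs = list(itertools.combinations(range(len(images)), 2))
    return sum(dist_fn(images[i], images[j]) for i, j in pairs) / len(pairs)

# Toy stand-in distance; a real run would wrap lpips.LPIPS(net='alex') instead.
def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Simulate 10 composites per test sample (10 random latent vectors).
rng = np.random.default_rng(0)
composites = [rng.random((3, 64, 64)) for _ in range(10)]
diversity = mean_pairwise_distance(composites, mse)
```

With 10 composites this averages over 45 image pairs; identical outputs would score 0, so higher values indicate more diverse placements.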