fnzhan / UNITE

[CVPR 2022 Oral] Marginal Correspondence for Conditional Image Generation, [CVPR 2021] Unbalanced Feature Transport for Exemplar-based Image Translation

About data inputs #6

Open UdonDa opened 2 years ago

UdonDa commented 2 years ago

Hi @fnzhan!

Thank you for providing your nice implementation.

I have a question about the inputs to the networks, especially for the CelebA edge case.

The correspondence predictor is given RGB images and a seg_map (https://github.com/fnzhan/UNITE/blob/main/models/networks/correspondence.py#L200).

The CelebA segmaps (15 channels) are created via the get_label_tensor function (https://github.com/fnzhan/UNITE/blob/main/data/celebahqedge_dataset.py#L77). It seems that these segmaps include not only an edge map but also distance-transformed images.

Why did you use additional information such as semantic maps? Does your method not work well on a dataset with no additional labels, e.g. AFHQ (an animal face dataset)?
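For context, here is roughly how I understand such a label tensor is assembled. This is only a sketch: the channel counts and the helper name `make_label_tensor` are my own guesses, not the repository's actual `get_label_tensor`.

```python
import cv2
import numpy as np
import torch

def make_label_tensor(edge, parsing, num_classes=13):
    """Illustrative only: stack an edge map, its distance transform, and
    one-hot semantic channels into a (2 + num_classes)-channel tensor.
    The channel layout and count are guesses, not the repo's get_label_tensor."""
    # Dense distance-to-nearest-edge channel from the sparse binary edge map.
    dist = cv2.distanceTransform(255 - edge, cv2.DIST_L2, 3)
    dist = dist / (dist.max() + 1e-8)

    # One-hot encode the semantic parsing map (class ids in [0, num_classes)).
    onehot = np.eye(num_classes, dtype=np.float32)[parsing]  # H x W x C
    onehot = onehot.transpose(2, 0, 1)                       # C x H x W

    channels = np.concatenate(
        [edge[None].astype(np.float32) / 255.0, dist[None], onehot], axis=0
    )
    return torch.from_numpy(channels)  # (2 + num_classes) x H x W

# Toy inputs: a 256x256 edge map and a 13-class parsing map.
edge = np.zeros((256, 256), dtype=np.uint8)
cv2.rectangle(edge, (64, 64), (192, 192), color=255, thickness=1)
parsing = np.random.randint(0, 13, size=(256, 256))
label = make_label_tensor(edge, parsing)
print(label.shape)  # torch.Size([15, 256, 256])
```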

Thanks.

fnzhan commented 2 years ago

Hi @UdonDa, thanks for your interest. The code inherits from the implementation of CoCosNet, which includes additional information for building the correspondence. This is indeed somewhat different from the description in the CoCosNet paper; we simply follow that setting for a fair comparison.

The original edge map is sparse, which tends to be suboptimal for feature extraction, so a distance-transformed image is included to provide a denser representation.
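To illustrate the point, here is a toy sketch using OpenCV (not the exact code in the dataset loader) showing how the distance transform turns a sparse edge map into a dense field:

```python
import cv2
import numpy as np

# Toy sparse edge map: a single thin contour in a 256x256 image.
edge = np.zeros((256, 256), dtype=np.uint8)
cv2.circle(edge, (128, 128), 60, color=255, thickness=1)

# Distance transform of the inverted edge map: each pixel now stores its
# distance to the nearest edge, so nearly every pixel carries a signal.
dist = cv2.distanceTransform(255 - edge, cv2.DIST_L2, 3)

print(f"edge map:     {np.count_nonzero(edge) / edge.size:.2%} non-zero pixels")
print(f"distance map: {np.count_nonzero(dist) / dist.size:.2%} non-zero pixels")
```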

I haven't checked the performance on CelebA without the additional information, but I believe it would degrade significantly under that setting.

UdonDa commented 2 years ago

Thank you for the quick reply and the explanation.

I understand now that the distance-transformed image is quite important. I'll try training a network without this information.

Thank you!