clovaai / tunit

Rethinking the Truly Unsupervised Image-to-Image Translation - Official PyTorch Implementation (ICCV 2021)

FFHQ results #33

Closed sunwoo76 closed 2 years ago

sunwoo76 commented 2 years ago

Hello :)

I have a question about Figure 4 in the paper.

In the paper, the FFHQ results were generated using the averaged style vector of each domain.

However, I see that the style vector is generated globally: there is only one style vector, shared by all the domains.

Could you explain what "averaged style vectors of each domain" means?

Thank you :)

FriedRonaldo commented 2 years ago

If you are referring to "Figure 4: Cross-domain attribute translation using 0.1% of labeled samples": I manually labeled some images for each attribute (glasses, aging, gender, hair color) and then trained TUNIT in a semi-supervised manner.

For "glasses" as an example, I labeled 35 images as "glasses" and another 35 images as "no glasses". Then, there are two domains (glasses & no glasses), therefore, we can train TUNIT like a cross-domain translation model in the semi-supervised learning with an extremely small number of labeled samples.

Because this becomes a sort of cross-domain translation problem with two domains, we can compute the average style vector of each domain and perform the translation using that average vector.

Also, please note that the FFHQ result in Figure 5 is different from that of Figure 4 (Figure 5: unsupervised / Figure 4: semi-supervised).


You can also perform the translation with average style vectors when TUNIT is trained in a fully unsupervised manner: the clustering constructs K domains, so we can compute the average style vector of each cluster and use it for translating images.
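The per-domain averaging described above can be sketched as follows. Note this is a minimal illustration, not code from the TUNIT repository: `average_style_vectors` is a hypothetical helper, and the domain labels are assumed to come either from manual annotation (the semi-supervised case of Figure 4) or from the guiding network's cluster assignments (the unsupervised case).

```python
import torch


def average_style_vectors(style_vectors: torch.Tensor,
                          domain_labels: torch.Tensor,
                          num_domains: int) -> torch.Tensor:
    """Compute the mean style vector of each domain.

    style_vectors: (N, D) style codes, one per image.
    domain_labels: (N,) integer domain indices in [0, num_domains).
    Returns a (num_domains, D) tensor of per-domain averages.
    """
    dim = style_vectors.size(1)
    sums = torch.zeros(num_domains, dim)
    counts = torch.zeros(num_domains, 1)
    # Accumulate style vectors and counts per domain index.
    sums.index_add_(0, domain_labels, style_vectors)
    counts.index_add_(0, domain_labels,
                      torch.ones(len(domain_labels), 1))
    # Avoid division by zero for empty domains.
    return sums / counts.clamp(min=1)


# Toy example: 3 style codes, 2 domains (labels 0, 0, 1).
styles = torch.tensor([[1., 1.], [3., 3.], [5., 5.]])
labels = torch.tensor([0, 0, 1])
avg = average_style_vectors(styles, labels, num_domains=2)
```

Translation would then feed the chosen domain's average to the generator in place of a per-image style code, e.g. something like `generator(content_image, avg[target_domain])` (the exact generator interface depends on the codebase).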