NVlabs / FUNIT

Translate images to unseen domains at test time with a few example images.
https://nvlabs.github.io/FUNIT/

do we need to separate content and style images during training? #50

Closed emergencyd closed 2 years ago

emergencyd commented 2 years ago

It seems that during training we use the same dataloader to supply both the content and the style images. Say we have class A and class B; there is a chance that both the content image and the style image come from A. Is that reasonable?

Also, how do we define "num_classes" in the discriminator? Is it equal to the number of classes in the whole training set? What if the content data loader and the style data loader use different classes?
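To make the first point concrete, here is a minimal sketch of the sampling behaviour I am describing (not the actual FUNIT loader; the dataset and file names are made up):

```python
import random

# Toy dataset: (image, class) pairs for class A (huskies) and class B (corgis).
dataset = [("husky_0.jpg", "A"), ("husky_1.jpg", "A"), ("corgi_0.jpg", "B")]

def sample_pair():
    # Content and style are drawn independently from the SAME pool,
    # so both can land in class A.
    content = random.choice(dataset)
    style = random.choice(dataset)
    return content, style

print(sample_pair())  # e.g. (('husky_0.jpg', 'A'), ('husky_1.jpg', 'A'))
```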

mingyuliutw commented 2 years ago

Re: there is a chance that both the content image and the style image come from A. Is that reasonable?

Yes, this is intended. Say the two images are two husky dogs with different coat colors. This simply becomes a same-domain translation task, which is still a valid translation task, just a bit simpler. But I believe there are cases where we want to separate the content dataset and the style dataset.

Re: Also, how to define the "num_classes"

This should be the number of classes in the style dataset. This parameter is used by the discriminator.
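Roughly speaking, the discriminator produces one real/fake output per class and each sample is scored only on its own class channel, which is why this parameter matters. A simplified sketch (not the actual FUNIT code; the layers and sizes are arbitrary):

```python
import torch
import torch.nn as nn

class MultiClassDisc(nn.Module):
    def __init__(self, num_classes, feat_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(feat_dim, feat_dim, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        # One real/fake output map per class -> num_classes output channels.
        self.head = nn.Conv2d(feat_dim, num_classes, 1)

    def forward(self, images, class_ids):
        logits = self.head(self.backbone(images))           # (B, num_classes, H, W)
        idx = class_ids.view(-1, 1, 1, 1).expand(-1, 1, *logits.shape[2:])
        return logits.gather(1, idx)                        # each sample's own class channel

disc = MultiClassDisc(num_classes=10)
scores = disc(torch.randn(2, 3, 64, 64), torch.tensor([3, 7]))
print(scores.shape)  # torch.Size([2, 1, 16, 16])
```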

We have a new implementation of this repo that we are actively maintaining. You are welcome to take a look: https://github.com/NVlabs/imaginaire

emergencyd commented 2 years ago

https://github.com/NVlabs/FUNIT/blob/198f430b6ef1abe940251b30f51d5cc68a476787/funit_model.py#L39

But it seems that the content image is also fed into the discriminator. So I suppose num_classes should be the union set(class_content + class_style). Otherwise, the content and style classes would probably share the same classifier output. Say the content class is A and the style class is B: if we set num_classes=1, there is only one output channel. @mingyuliutw
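In other words, I would build the discriminator's label space over both loaders, something like this (class names are made up):

```python
content_classes = ["A"]            # classes used by the content loader
style_classes = ["B"]              # classes used by the style loader

all_classes = sorted(set(content_classes) | set(style_classes))
class_to_idx = {c: i for i, c in enumerate(all_classes)}

num_classes = len(all_classes)     # here 2, not 1, so A and B each get
                                   # their own output channel
print(num_classes, class_to_idx)   # 2 {'A': 0, 'B': 1}
```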