LTH14 / rcg

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701
MIT License

Concern about class-unconditional #11

Closed by impiga 8 months ago

impiga commented 8 months ago

Dear authors, thanks for open-sourcing your great work!

However, I have a question about the SSL encoder. I noticed that the MoCo v3 models are trained on ImageNet. Could they be influenced by the dataset's intrinsic class structure, so that the learned representations mirror the 1,000 ImageNet categories? If so, this might challenge the claim that RCG is class-unconditional.

Perhaps training MoCo v3 on a dataset without a fixed class structure, such as web images, and then using it for RCG could clarify this.

Looking forward to your thoughts.

LTH14 commented 8 months ago

Thanks for your interest! The ultimate goal of RCG is to train all three modules (image encoder, RDM, pixel generator) on large-scale unlabeled datasets such as web images. However, training MoCo v3 on a dataset other than ImageNet would introduce additional training data and therefore make the comparison with prior generative methods on ImageNet unfair. For this reason, in the paper we follow the common setting in which all three modules are trained on ImageNet.

As shown in the MoCo v3 paper, it can learn a representation that achieves >75% linear probing accuracy on ImageNet. However, this does not violate the claim that RCG is class-unconditional. Class-unconditional generation is defined as generation that uses no class label information during either training or inference, and that can therefore generate images from all classes in a dataset. Although MoCo v3 is trained on ImageNet, it does not use any class labels, and thus satisfies the class-unconditional assumption. The training and generation of RCG do not involve class labels either. Instead, RCG conditions on MoCo v3 representations, which provide high-level guidance for the generation process. As shown in Figure 5 of the paper, the generated results are diverse and cover different classes, further supporting the claim that RCG is class-unconditional.
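To make the distinction concrete, here is a minimal toy sketch of the data flow described above. It is not the RCG code: the linear `encode`, the Gaussian `fit_representation_sampler` (a stand-in for the RDM), and the linear `generate` (a stand-in for the pixel generator) are all hypothetical placeholders with random, untrained weights. The only point it illustrates is that no function in the pipeline takes a label argument; the generator is conditioned on a sampled representation alone.

```python
import random


def encode(image, enc_weights):
    """Toy stand-in for a frozen self-supervised encoder (a fixed linear map)."""
    return [sum(w * x for w, x in zip(row, image)) for row in enc_weights]


def fit_representation_sampler(reps):
    """Toy stand-in for the RDM: per-dimension mean and std of the representations."""
    n, dim = len(reps), len(reps[0])
    means = [sum(r[d] for r in reps) / n for d in range(dim)]
    stds = [(sum((r[d] - means[d]) ** 2 for r in reps) / n) ** 0.5 for d in range(dim)]
    return means, stds


def sample_representation(sampler, rng):
    """Draw a representation from the fitted distribution (no label involved)."""
    means, stds = sampler
    return [rng.gauss(m, s) for m, s in zip(means, stds)]


def generate(rep, dec_weights):
    """Toy stand-in for the pixel generator, conditioned only on a representation."""
    return [sum(w * r for w, r in zip(row, rep)) for row in dec_weights]


def run_pipeline(images, rep_dim=4, seed=0):
    """Run encoder -> representation sampler -> generator on unlabeled images."""
    rng = random.Random(seed)
    pixel_dim = len(images[0])
    # Random, untrained weights: an assumption made purely for illustration.
    enc_w = [[rng.uniform(-1, 1) for _ in range(pixel_dim)] for _ in range(rep_dim)]
    dec_w = [[rng.uniform(-1, 1) for _ in range(rep_dim)] for _ in range(pixel_dim)]
    reps = [encode(img, enc_w) for img in images]
    sampler = fit_representation_sampler(reps)
    rep = sample_representation(sampler, rng)
    return generate(rep, dec_w)


if __name__ == "__main__":
    # The dataset is a list of images only; there is no label column anywhere.
    unlabeled = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
    print(run_pipeline(unlabeled, rep_dim=2, seed=1))
```

Linear probing, by contrast, would fit a classifier on top of the frozen representations using labels, but only as an evaluation of representation quality; it is not a step of the generation pipeline sketched here.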