wfanyue / DPG-T2I-Personalization

[ECCV 2024] Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning

Is the Q_phi an identity mapping? #4

Open · CharlesGong12 opened 2 weeks ago

CharlesGong12 commented 2 weeks ago

Hi, thanks for your work!

Q_phi's input is torch.square(x_0_latents - x_0_gt), and its output is trained to fit F.mse_loss(x_0_gt, x_0_latents, reduction="mean"), which is exactly the mean of Q_phi's input. So doesn't Q_phi only need to learn an identity mapping followed by a mean? If so, how could it work?
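
For concreteness, the identity this question relies on can be checked directly; the tensor shapes below are hypothetical stand-ins for the latents:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins for the predicted and ground-truth latents.
x_0_latents = torch.randn(4, 4, 64, 64)
x_0_gt = torch.randn(4, 4, 64, 64)

# Q_phi's input, per the question above.
q_phi_input = torch.square(x_0_latents - x_0_gt)

# Its regression target is the mean of that same tensor, since
# F.mse_loss with reduction="mean" computes mean((input - target) ** 2).
target = F.mse_loss(x_0_gt, x_0_latents, reduction="mean")

assert torch.allclose(q_phi_input.mean(), target)
```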

wfanyue commented 1 week ago

Hi, thanks for your question!

Our work provides a framework for incorporating various supervision objectives into personalized T2I generation models.

Regarding your questions:

  1. Q_phi denotes the reward model. Its goal is to predict the reward and to act as a differentiable function through which the framework is optimized (a rough sketch of such a reward head follows this list).

  2. About the design:

     1. We first reformulate the current T2I personalization objective (e.g., DreamBooth). In that special case, Q_phi serves as an identity mapping.

     2. The code released so far is the implementation of 'Look Forward'. It can be used directly as the loss function to train the model, and it works well; here we integrate it into our DPG framework, where it is also effective.

     3. Our framework can handle more complex supervisory signals, such as DINO similarity (as mentioned in our paper) and human-face similarity. Since I have just graduated and am busy with work, that code should be released between October 1st and October 7th. Apologies for the delay.
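As mentioned in point 1 above, here is a minimal sketch of what a learned Q_phi reward head could look like under this framing. It is an illustration only, not the released DPG code: the architecture, channel count, and variable names are assumptions.

```python
import torch
import torch.nn as nn

class QPhi(nn.Module):
    """Minimal reward-head sketch: maps a per-pixel squared error in
    latent space to a scalar reward prediction. Architecture and sizes
    are illustrative assumptions, not the released implementation."""

    def __init__(self, in_channels: int = 4, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling to a per-sample vector
            nn.Flatten(),
            nn.Linear(hidden, 1),     # scalar reward per sample
        )

    def forward(self, x_0_latents: torch.Tensor, x_0_gt: torch.Tensor) -> torch.Tensor:
        # In the identity-mapping special case this head would only need to
        # average its input; a learned head keeps the same differentiable
        # interface while allowing richer rewards (e.g. DINO or face similarity).
        return self.net(torch.square(x_0_latents - x_0_gt)).squeeze(-1)

# Usage with hypothetical latent shapes (batch of 4, SD-style 4-channel latents):
q_phi = QPhi()
reward = q_phi(torch.randn(4, 4, 64, 64), torch.randn(4, 4, 64, 64))
print(reward.shape)  # torch.Size([4])
```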

If you have any other questions, feel free to contact me.