bytedance / ImageDream

The code release for https://image-dream.github.io/
Apache License 2.0

[Question] Could You Provide More Details About Pixel Controller? #8

Open BlingHe opened 8 months ago

BlingHe commented 8 months ago

Hi authors, thanks for sharing this great work!

I have some questions about the Pixel Controller.

As you mentioned in the paper, the pixel controller is responsible for integrating fine-grained object appearance: the image latent of the reference image (named "ip_img" in your code) is introduced by concatenating it with the other view images along the "batch" dimension. To the best of my knowledge, the newly added "batch item" still needs "noise" in order to compute the MSE loss. Could you explain in more detail how this pixel controller module is trained?

I would be very grateful for your help. Looking forward to your reply!

haodong2000 commented 8 months ago

I am curious about this too. Unlike Wonder3D and MVDream, which leverage multi-view attention without an explicit reference image latent, this pixel controller does need some way to handle the predicted noise over the reference image latent during training.

pengwangucla commented 8 months ago

It is very simple: we can just ignore the loss over the last frame.
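
For readers who want to see what "ignoring the loss over the last frame" could look like, here is a minimal sketch. It is not the repository's training code. It uses NumPy rather than PyTorch so that it runs standalone, and the function name `masked_mse_loss`, the tensor layout `(batch, num_views + 1, C, H, W)`, and the placement of the reference-image latent as the last frame of each sample are all assumptions made for illustration. Whether the reference latent itself is noised during training is not stated in this thread, so the sketch makes no claim about that.

```python
import numpy as np


def masked_mse_loss(noise_pred, noise_target, num_views):
    """MSE over the target-view frames only; the trailing reference frame is skipped.

    Both arrays have the hypothetical shape (batch, num_views + 1, C, H, W),
    with the reference-image latent stored as the last frame of each sample.
    """
    assert noise_pred.shape == noise_target.shape
    assert noise_pred.shape[1] == num_views + 1
    # Drop the last frame (the reference image) before computing the loss.
    diff = noise_pred[:, :num_views] - noise_target[:, :num_views]
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
batch, num_views, c, h, w = 2, 4, 4, 8, 8

# Hypothetical stand-ins for the true noise and the denoiser's prediction.
noise_target = rng.standard_normal((batch, num_views + 1, c, h, w))
noise_pred = noise_target.copy()

# Corrupt only the reference frame: the masked loss should not see it.
noise_pred[:, -1] += 10.0
loss_ref_only = masked_mse_loss(noise_pred, noise_target, num_views)

# Corrupt one target view: now the loss becomes positive.
noise_pred[:, 0] += 1.0
loss_with_view_error = masked_mse_loss(noise_pred, noise_target, num_views)
```

With this masking, whatever the network predicts for the reference frame contributes no gradient, so no noise target is needed for that extra item. The same slicing carries over directly to PyTorch tensors.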