XingangPan / DragGAN

Official Code for DragGAN (SIGGRAPH 2023)
https://vcai.mpi-inf.mpg.de/projects/DragGAN/
Other
35.65k stars 3.44k forks

Wrong result #414

Open Yurains opened 8 months ago

Yurains commented 8 months ago

Thank you for your outstanding work. I am training my own model using StyleGAN2-ADA (PyTorch) and importing other photos with PTI, but I ran into this error: "AssertionError: Wrong size for dimension 1: got 18, expected 12". It seems to be a dimension-related problem, but I'm not sure how to resolve it. Is there a way to make the necessary changes?

PDillis commented 7 months ago

I haven't used PTI, but I can tell you where that error comes from and how to find where the code fails. In StyleGAN1/2, the mapping network $f$ (G.mapping) takes a random latent z ($z\in\mathbb{R}^{512}$) and outputs a disentangled latent w ($w\in\mathbb{R}^{1\times n\times512}$); for unconditional models, you simply do w = G.mapping(z, None). The disentangled latent w is the one you need to find in order to edit with DragGAN (using either simple inversion or PTI). Its dimension $n$ depends on the image resolution of your dataset, i.e. the size of the images that will be generated.

Concretely, StyleGAN expects two slices of the disentangled latent per resolution block in the synthesis network $g$ (G.synthesis), which starts at 4 and goes up by powers of 2 until your final output resolution, so $n = 2\log_2(\text{resolution}) - 2$ (more info in the StyleGAN architecture). So, from the AssertionError you posted above, it seems like PTI is giving you a disentangled latent of shape [1, 18, 512], whereas the network you are training expects a disentangled latent of shape [1, 12, 512]. In other words, PTI has hard-coded an image resolution of 1024 ($n=18$), whereas your StyleGAN2 model has a resolution of 128 ($n=12$).
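As a quick sanity check on those numbers, here is a small sketch of the relationship between output resolution and $n$. It assumes the standard StyleGAN2 layout (square, power-of-two images, synthesis blocks from 4 up to the output resolution), which gives $n = 2\log_2(\text{resolution}) - 2$. The function names are made up for illustration and are not part of StyleGAN2-ADA, PTI, or DragGAN.

```python
def num_ws_for_resolution(resolution: int) -> int:
    """Number of w vectors (n) a StyleGAN2-style generator expects.

    Assumes a square, power-of-two resolution >= 4.
    """
    if resolution < 4 or resolution & (resolution - 1) != 0:
        raise ValueError(f"resolution must be a power of two >= 4, got {resolution}")
    log2_res = resolution.bit_length() - 1  # exact log2 for powers of two
    return 2 * log2_res - 2


def resolution_for_num_ws(num_ws: int) -> int:
    """Inverse mapping: output resolution implied by a latent with num_ws rows."""
    if num_ws < 2 or num_ws % 2 != 0:
        raise ValueError(f"num_ws must be an even number >= 2, got {num_ws}")
    return 2 ** ((num_ws + 2) // 2)


print(num_ws_for_resolution(1024))  # 18
print(resolution_for_num_ws(12))    # 128
```

So a latent with 18 rows was produced for a 1024 model, and a generator that asks for 12 rows was built for 128 images.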

I could be wrong and it might be the other way around, so it always helps to tell us which code you ran and which line raised the AssertionError above; otherwise, all we can do is guess.
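To narrow it down, one option is to compare the shape of the latent you pass in with what the loaded generator reports. In the official stylegan2-ada-pytorch code the generator exposes `G.num_ws` and `G.w_dim`. The helper below is only a hypothetical sketch (its name and messages are mine, not from any of the repos), and it takes plain shapes so it runs without a checkpoint.

```python
def check_latent_shape(w_shape, num_ws, w_dim=512):
    """Return a list of human-readable problems; an empty list means the shape fits.

    w_shape: shape of the inverted latent, e.g. tuple(w.shape) from PTI.
    num_ws, w_dim: what the generator expects, e.g. G.num_ws and G.w_dim.
    """
    problems = []
    if len(w_shape) != 3:
        problems.append(
            f"expected 3 dimensions [batch, num_ws, w_dim], got {len(w_shape)}"
        )
        return problems
    _, got_ws, got_dim = w_shape
    if got_ws != num_ws:
        problems.append(f"dimension 1: got {got_ws}, expected {num_ws}")
    if got_dim != w_dim:
        problems.append(f"dimension 2: got {got_dim}, expected {w_dim}")
    return problems


# Latent inverted for a 1024 model, checked against a 128 model (n=12):
print(check_latent_shape((1, 18, 512), num_ws=12))
# Same latent checked against a 1024 model (n=18):
print(check_latent_shape((1, 18, 512), num_ws=18))
```

With a real model you would call something like `check_latent_shape(tuple(w.shape), G.num_ws, G.w_dim)` right before `G.synthesis(w)`, for every generator that gets loaded along the way, to see which one disagrees with the latent.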

Yurains commented 6 months ago

@PDillis Sorry for the late reply, and thank you very much for your answer. I tried to combine PTI with DragGAN and checked which default model it expects to use.

This is the official default model, and it works without any problem.

Here I made sure my dimensions are correct and used the specified [1, 18, 512], but the error still occurs. (Screenshot 2024-03-04 035801)

Yurains commented 6 months ago

@PDillis If you need the code, I can email it to you. Thanks.