Closed ChikaYan closed 1 year ago
Yeah, I think if you just comment out line https://github.com/SizheAn/PanoHead/blob/02073a42f169b4d3dd4e2450f846342e4c2d0525/projector_withseg.py#L320 and uncomment line https://github.com/SizheAn/PanoHead/blob/02073a42f169b4d3dd4e2450f846342e4c2d0525/projector_withseg.py#L319, changing it to
dataset_kwargs = dnnlib.EasyDict(class_name='training.dataset.ImageFolderDataset', path=target_img, use_labels=True, max_size=None, xflip=False)
then PTI should still work.
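For anyone unfamiliar with `dnnlib.EasyDict`: it is essentially a dict with attribute access, so the line above just bundles the dataset constructor arguments. A minimal self-contained sketch of the swapped line (the `EasyDict` stand-in and the `target_img` path are illustrative, not the repo's actual code):

```python
class EasyDict(dict):
    """Minimal stand-in for dnnlib.EasyDict: a dict that also
    allows attribute-style access (d.key == d['key'])."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value

# Hypothetical path to the folder containing the target image
target_img = 'dataset/testdata_img'

# The uncommented line from projector_withseg.py, using the stand-in:
dataset_kwargs = EasyDict(class_name='training.dataset.ImageFolderDataset',
                          path=target_img, use_labels=True,
                          max_size=None, xflip=False)

print(dataset_kwargs.class_name)  # attribute access, as dnnlib.EasyDict allows
```

The key change versus line 320 is `use_labels=True`, so the dataset loader reads the camera labels alongside the image.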
Using the mask won't ultimately solve this problem, IMO. The inversion itself does not fail at finding the closest latent: as you can see, almost all the reconstructed images for frontal faces are still high quality. The problem occurs when we change the camera pose to side/back views, which means the pretrained model's learned 3D prior is not good/generalizable enough. This is just my opinion; feel free to try something with masks and let us know! :)
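That said, if someone does want to experiment with masks during inversion, a common approach is to weight the reconstruction loss by the mask so that hair/background pixels do not dominate the optimization. A minimal sketch of such a masked L2 term (all names here are hypothetical, this is not part of `projector_withseg.py`):

```python
import numpy as np

def masked_l2_loss(synth, target, mask):
    """L2 reconstruction loss restricted to masked-in pixels.

    synth, target: (H, W, 3) float arrays (synthesized vs. target image).
    mask:          (H, W) array in {0, 1}; 1 = pixel counts toward the loss.
    """
    diff = (synth - target) ** 2           # per-pixel squared error
    weighted = diff * mask[..., None]      # zero out masked-out regions
    # Mean over the valid (masked-in) pixels only; epsilon avoids div-by-zero
    return weighted.sum() / (mask.sum() * 3 + 1e-8)
```

The same masking idea would apply to the LPIPS term used in PTI, though a perceptual loss needs the mask applied to the input images (or feature maps) rather than to per-pixel differences.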
Thx a lot! Yeah it makes sense
Hi, thank you for the brilliant work!
May I just ask a quick question -- are the masks actually used during the PTI inversion? The code in
projector_withseg.py
seems to only read the image directly from the given path, without reading or using the provided masks at all. Is this expected? If so, may I ask whether it is actually possible to utilize a mask during the inversion? At the moment, inversion seems to fail pretty badly if the image contains a large area of hair.
Thank you in advance!